The Compound Hypergeometric Distribution and a System of Single Sampling Inspection Plans Based on Prior Distributions and Costs


A. Hald


University of Copenhagen 


The main results of this paper, apart from the limit theorems, were given in two lec- 
tures at the University of London, and in a seminar at Imperial College of Science and 
Technology, London, January 1959. The complete paper was presented for discussion 
in a meeting at Imperial College in February 1960. The paper reviews present sampling 
inspection plans for attributes placing particular emphasis on their underlying assump- 
tions. A model is then proposed based upon prior distributions and costs, and optimum 
sampling plans are derived which minimize the average costs for any prior distribu- 
tion. Tables and examples are provided. 

1. INTRODUCTION AND SUMMARY


The present paper consists of two main parts. The one part, given in sections 
7 and 8, is essentially probability theory and gives a number of important 
general theorems for the compound hypergeometric and the compound binomial 
distribution. These two sections may be read independently of the remainder 
of the paper. 

The other main part contains first in sections 2 to 5 a survey of some widely 
used concepts and systems of sampling inspection plans for attributes. The
survey is biased since its purpose is to point out that all sampling plans im- 
plicitly assume the existence of a prior distribution and certain costs. These 
implicit assumptions are analysed and the arbitrariness of existing plans is 
pointed out and discussed in relation to the prior distribution and the costs. 
Next a model based on prior distributions and costs is formulated in section 6. 
The model is very simple in economic respects since it consists of only three 
terms: (1) the costs of sampling inspection, (2) the loss due to accepted defec- 
tives, and (3) the costs of rejected lots. This simplicity, however, makes it possible 
to limit the number of cost-parameters to one or two. Similar models have been 
set up by several other authors during the last few years. 

In section 9 the general solution is given to the problem of determining the 
optimum sampling plan, i.e. the plan minimising the cost function, for any
prior distribution. The solution is based on the theorems developed in sections 
7 and 8. It is further shown how a system of optimum plans may easily be
tabulated for any compound binomial distribution as prior distribution. As a
“large-sample” result of general interest it is proved that sample size should
increase in proportion to the logarithm of lot size if the compound binomial prior
distribution has a discrete limit for large lots, whereas in the continuous case
sample size should increase in proportion to the square root of lot size. Sections
10-12 contain detailed results and numerical examples for the rectangular, the 
Polya, and the mixed binomial distributions as prior distributions. 

Besides drawing attention to the prior distribution and the cost function as 
the foundation for the development of sound sampling inspection plans it is 
pointed out here that a good system of sampling plans should include a feed- 
back mechanism to keep the plan up to date with regard to changes in the 
prior distribution and the cost parameters. This important aspect of the theory 
of sampling inspection is, however, not discussed in the present paper. 


2. THE OPERATING CHARACTERISTIC 


Since old times sampling inspection has been used from the point of view 
that the function of inspection is to provide information on quality of delivered 
goods and feed that information back to the supplier so that he is constantly 
reminded that good quality is important to the consumer. With that attitude 
to inspection there will be no definite decision rule as regards acceptance or 
rejection of submitted lots but in doubtful cases there will be negotiations 
between supplier and consumer about the consequences of unsatisfactory lots. 
This system is still the most widespread and where it is used it is often found that 
the inspector on the one hand is not fully aware of the limited information in 
small samples and on the other hand relies heavily on the psychological effect 
of inspection, an effect which undoubtedly exists in the sense that the prior 
distribution (the distribution of submitted lots according to fraction defective) 
and the inspection system are interdependent. 

One of the reasons why the classical system does not contain any definite 
decision rule is that 30 years or more ago there was not any widespread 
knowledge of how probability theory could be used to predict the consequences 
of applying such rules, for instance with regard to the frequency of accepting
lots of various quality. However, since Dodge and Romig (1) in their fundamental
paper from 1929 introduced the concepts of the lot tolerance per cent defective,
the consumer's risk, the process average and the average amount of inspection,
and showed how to express these quantities by the lot size, the sample size and
the acceptance number, the situation has changed. Of the concepts developed
the most popular and useful has proved to be the operating characteristic.

A single sampling plan is defined by three numbers: the lot size N, the 
sample size n, and the acceptance number c, and the following decision rule: 
accept the lot if the number of defectives in the sample is equal to or less than 
the acceptance number, otherwise reject the lot. For a lot with fraction defective 
p = X/N the probability of acceptance is 


$$P_a(p) = \sum_{x=0}^{c} \binom{X}{x}\binom{N-X}{n-x} \Big/ \binom{N}{n}. \tag{1}$$


Plotting the acceptance probability as a function of the fraction defective of 
the inspection lot we get the operating characteristic (OC) curve of the given 
sampling plan which completely describes the discriminating power of the plan, 
i.e. its ability to discriminate between lots with low and high fractions defective,
respectively.
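
As a numerical illustration (not part of the original exposition), the acceptance probability of a single sampling plan can be evaluated directly from the hypergeometric sum. The sketch below is a minimal Python version; the function names `acceptance_probability` and `oc_curve` and the example plan are assumptions chosen for illustration only.

```python
from math import comb

def acceptance_probability(N, n, c, X):
    """Probability of finding at most c defectives in a sample of n items
    drawn without replacement from a lot of N items containing X defectives."""
    total = comb(N, n)
    # comb() returns 0 when the lower index exceeds the upper one, so
    # impossible sample compositions contribute nothing to the sum.
    favourable = sum(comb(X, x) * comb(N - X, n - x)
                     for x in range(min(c, X) + 1))
    return favourable / total

def oc_curve(N, n, c):
    """Points (p, P_a(p)) of the OC curve for p = X/N, X = 0, 1, ..., N."""
    return [(X / N, acceptance_probability(N, n, c, X)) for X in range(N + 1)]

if __name__ == "__main__":
    # Hypothetical plan: lot size 100, sample size 10, acceptance number 1.
    for p, pa in oc_curve(N=100, n=10, c=1)[:11]:
        print(f"p = {p:.2f}   P_a = {pa:.4f}")
```

Tabulating the curve in this way shows how steeply the acceptance probability falls as the fraction defective increases, which is what the text calls the discriminating power of the plan.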

Any plan will give the supplier a strong incentive to maintain his quality at 
a level where the probability of acceptance is high and in that way the sampling 
plan influences the prior distribution. Just as the plan determines the operating 
characteristic, a given operating characteristic may be used for determining 
the corresponding plan. From a purely statistical point of view that is usually 
done by choosing two points on the OC-curve which lead to two equations for 
the determination of n and c. If the inspector knows what operating characteristic 
he wants the corresponding plan is fully determined. The word “inspector’’ is 
here used to designate the person who chooses the sampling plan. He may be 
a quality control engineer or a statistician. 

The two points on the OC-curve are usually called the producer's and consumer's
risk points with their associated risks. The producer's risk point consists of a low
fraction defective having a high acceptance probability, 0.95 say, and consequently
a small risk (the producer's risk) of rejection. The consumer's risk point
consists of a high fraction defective with a low acceptance probability (the
consumer's risk), 0.10 say. Tippett (2) describes this set-up in the following terms:
“One procedure postulates rather narrow-minded producers and consumers 
whose imagined interests are protected independently. It is in the interest of 
the producer to ensure only that an undue proportion of satisfactory batches 
are not rejected; he does not mind how many unsatisfactory batches are accepted. 
Likewise, it is in the interest of the consumer to ensure only that an undue 
proportion of unsatisfactory batches are not accepted; he does not mind how 
many satisfactory batches are rejected. The producer and consumer so described 
are notional; no producer or consumer in real life can define his interests so 
narrowly.” 

In practice the difficulty lies in fixing the two points on the OC-curve in a 
rational manner and that among other things means taking the prior distribution
and costs into account. Among the practical considerations in determining
the risk points the following may be mentioned: What quality has the consumer 
actually got previously, i.e. what is the (prior) distribution of previously sub- 
mitted lots? What is normal market quality at the price the consumer is willing 
to pay, i.e. what are the (prior) distributions for the suppliers of the market?
What fraction defective can the consumer tolerate without giving him any 
essential trouble with the intended use of product and what fraction defective 
will be intolerable, i.e. how does the trouble (damage, loss) depend on number 
of defective items accepted? 

Besides these points other considerations, as for example the need for the
goods in question, come in. The conclusion is that even if we have a perfect
solution to the purely statistical problem of determining a sampling plan corre- 
sponding to any given set of two points on an OC-curve the resulting plan must 
to a large extent be considered as arbitrary because we have no rational way of 
choosing the producer's and consumer's risks and risk points.

Let us consider the statistical solution a little closer. Under normal conditions
in practice the binomial distribution will be a sufficiently close approximation
to the hypergeometric distribution in (1) for the determination of (n, c) from
two given values of $(p, P_a)$. Denoting the producer's risk point by $p_1$ and the
corresponding risk by $\alpha$, the consumer's risk point by $p_2$ and the risk by
$\beta$, we find according to Peach and Littauer (3)

$$\frac{(n - c)\,p_1}{(c + 1)\,q_1} = F_{\alpha} \tag{2}$$

and

$$\frac{(n - c)\,p_2}{(c + 1)\,q_2} = F_{1-\beta} \tag{3}$$

where $q = 1 - p$, $P\{F < F_{\alpha}\} = \alpha$, and the degrees of freedom for $F$
are $2(c + 1)$ and $2(n - c)$. The solution of these equations for $\alpha = 0.05$ and
$\beta = 0.10$ has been facilitated by Grubbs (4) who tabulated $p_1$ and $p_2$ as
functions of (n, c) for c = 0(1)9 and n = 1(1)150.

In most cases, however, the Poisson distribution is a sufficiently good
approximation, which gives the even simpler solution

$$2np_1 = \chi^2_{\alpha} \tag{4}$$

and

$$2np_2 = \chi^2_{1-\beta} \tag{5}$$

or in another form

$$R = p_2/p_1 = \chi^2_{1-\beta}/\chi^2_{\alpha} \tag{6}$$

and

$$n = \chi^2_{\alpha}/2p_1 = \chi^2_{1-\beta}/2p_2, \tag{7}$$

where the degrees of freedom for $\chi^2$ equal $2(c + 1)$. Tables to facilitate the
solution of (6) and (7) have been given by Peach and Littauer (3), Cameron
(5) and Horsnell (6).
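
The Poisson solution can also be obtained without tables. The following Python sketch is an illustration under stated assumptions rather than the procedure of the authors cited: it finds the Poisson means corresponding to the two risk points by bisection (these means equal one half of the chi-square fractiles with 2(c + 1) degrees of freedom), takes the smallest acceptance number whose ratio does not exceed the required ratio of the two risk points, and rounds the sample size up so that the consumer's risk is respected. The function names and the rounding rule are assumptions.

```python
from math import ceil, exp

def poisson_cdf(c, m):
    """P{x <= c} for a Poisson variable with mean m."""
    term = total = exp(-m)
    for x in range(1, c + 1):
        term *= m / x
        total += term
    return total

def poisson_mean_for(c, prob):
    """Mean m such that P{x <= c} = prob, found by bisection
    (the Poisson cdf is a decreasing function of m)."""
    lo, hi = 0.0, 1.0
    while poisson_cdf(c, hi) > prob:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if poisson_cdf(c, mid) > prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def two_point_plan(p1, p2, alpha=0.05, beta=0.10):
    """Smallest c (and a matching n) for which the OC curve passes
    approximately through (p1, 1 - alpha) and (p2, beta)."""
    c = 0
    while True:
        m1 = poisson_mean_for(c, 1.0 - alpha)  # n * p1 at the producer's risk point
        m2 = poisson_mean_for(c, beta)         # n * p2 at the consumer's risk point
        if m2 / m1 <= p2 / p1:
            return ceil(m2 / p2), c            # round n up to protect the consumer
        c += 1

if __name__ == "__main__":
    print(two_point_plan(0.01, 0.05))
```

Because n and c must be integers, the two risks can only be met approximately; the sketch favours the consumer's risk, which is one of several possible conventions.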

From the fact that the hypergeometric distribution in these problems can be
approximated by the binomial or the Poisson-distribution it follows that (n, c) 
are independent of N, i.e. to get a given discriminating power of a sampling 
plan the same (n, c) must be used for all N larger than 10n, say. This result has, 
however, sometimes been stressed too much as if it implies that the same sample 
size and acceptance number should be used irrespective of the size of the lot. 
Under most circumstances in practice this is an erroneous conclusion because 
it disregards the economic implications of the lot size. 

Usually it is most inconvenient for the inspector to be asked to specify the
four quantities $(p_1, \alpha)$ and $(p_2, \beta)$. It is perhaps possible to get him to
specify $p_1$ and $p_2$ on the basis of technological and economic considerations but
then there still remains the choice of the two risks. This problem is analogous to the
problem of choosing the risks for committing errors of the first and second kind
in the testing of statistical hypotheses and it is “solved” in the same manner,
namely by fixing the risks conventionally at 5% and 10%. If, however, for
given values of $(p_1, p_2)$ the risks are reduced from (0.05, 0.10) to (0.025, 0.05)
or to (0.01, 0.05) the sample size will increase by about 50 and 80%, respectively.

Instead of determining the sampling plan from two points on the OC-curve 
Hamaker (7) has proposed to use one point, the “point of control’’ corresponding 
to an acceptance probability of 50%, and the relative slope of the curve in this 
point. From a statistical point of view this mode of specifying the OC-curve is 
just as satisfactory as the other one, and further it has the advantage that 
(n, c) are easily determined from the two chosen parameters. From a practical 
point of view, however, this method has the same weaknesses as discussed above 
for the other method and further it is usually very difficult to get the inspector 
to understand the meaning of the second parameter, the relative slope. 

To help the inspector out of his troubles various systems of sampling plans 
have been developed in which the number of parameters left to the inspector's
choice is limited to two easily comprehensible quantities. In constructing such 
systems the authors have to decide upon the values of certain parameters on 
the basis of their experiences and current conventions in the field. The advantage 
for the inspector of keeping to such a system is naturally that he gets a consistent 
and authoritative system which is easy to follow in an inspection department. 

Actually such systems are in widespread use and have proved to be of great 
practical value. The following critical remarks are of a theoretical nature and 
are not meant to detract from the usefulness of the systems. It may also be added 
that usually the authors of the systems are fully aware of the arbitrariness 
involved which can be seen from their introductory remarks to the tables. 


3. THE ACCEPTABLE QUALITY LEVEL 


A leading concept in some sampling inspection systems is the Acceptable 
Quality Level (AQL) which according to (8) is defined in the following way: 
“The maximum percent defective which can be considered satisfactory as a 
process average.’’ The purpose of the following remarks is to show that under 
some simple assumptions regarding costs it does not pay to inspect if the lots 
submitted for inspection are produced from a process which is in statistical control 
with a process average p less than or equal to the AQL, i.e. a binomial prior
distribution.

From a theorem by Mood (9) it follows that if the prior distribution is binomial 
with parameter p then the number of defectives in the samples and in the re- 
maining part of the lots are independently and binomially distributed with 
the same parameter p. The average number of defectives in accepted lots will
consequently be (N - n)p, so that the only effect of inspection under these
assumptions lies in the removal of the defectives found in the samples. Let us
further assume that the damage (loss) caused by accepting a defective is on
the average 1, i.e. the loss from accepting a defective is used as unit for the
other cost elements, and that the inspection cost per item inspected is k. The
cost of 100% inspection then becomes Nk and the loss from (cost of) no inspec- 
tion becomes Np. Since the quality is assumed to be acceptable we must have 
p < k. The cost for accepted lots after sampling inspection is consequently on 
the average equal to nk + (N — n)p. For rejected lots we assume that the lots 
are fully inspected so that the cost becomes Nk. The average total cost in con- 
nection with sampling inspection becomes the weighted average of the two 
above mentioned costs for accepted and rejected lots respectively, which leads 
to 


$$nk + (N - n)\,[\,pP_a + k(1 - P_a)\,]. \tag{8}$$


This quantity is always larger than Np for p < k. 

The conclusion is that sampling inspection is more costly than acceptance 
without inspection, losses from accepted defectives taken into account. This is 
as it should be since the process is assumed to be acceptable. Thus the AQL- 
concept implies both the idea of a prior distribution and some cost considerations. 
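
The argument of this section can be checked numerically. In the Python sketch below (an illustration only; the function names and the numerical values are assumptions) the sample from a binomial prior is itself binomially distributed, accepted lots cost nk + (N - n)p and rejected lots cost Nk, so that the average cost is the weighted average of the two with the acceptance probability as weight.

```python
from math import comb

def binomial_acceptance(n, c, p):
    """Probability of at most c defectives in a binomial sample of size n."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x)
               for x in range(min(c, n) + 1))

def sampling_cost(N, n, c, p, k):
    """Average cost per lot: accepted lots cost nk + (N - n)p,
    rejected lots are fully inspected and cost Nk."""
    pa = binomial_acceptance(n, c, p)
    return n * k + (N - n) * (p * pa + k * (1 - pa))

if __name__ == "__main__":
    N, p, k = 1000, 0.01, 0.05          # hypothetical values with p < k
    print("no inspection:", N * p)
    for n, c in [(20, 0), (50, 1), (125, 3)]:
        print(f"plan ({n}, {c}):", round(sampling_cost(N, n, c, p, k), 2))
```

For every plan with n > 0 and p < k the computed cost exceeds Np, in agreement with the conclusion that sampling inspection does not pay for a process in control at an acceptable level.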


4. THE SAMPLING INSPECTION TABLES OF THE STATISTICAL RESEARCH GROUP,
COLUMBIA UNIVERSITY, THE MILITARY-STANDARD-105, AND THE
PHILIPS STANDARD SAMPLING SYSTEM


The Statistical Research Group (SRG) has based its tables (10) on the following
assumptions: 


(1) The sample size is a certain function of the lot size (the sample size in- 
creases with the lot size but less than proportional). 
(2) The producer's risk shall be 5%.


About the first assumption the authors themselves write on p. 176 in (10):
“The relation between sample size and inspection lot size incorporated in Table
1 is fairly arbitrary. About the best that can be said for it is that it is not
unreasonable and that it provides a consistent and systematic basis for selecting
plans.” It is further explained that the discrimination, i.e. the slope of the
OC-curve, primarily depends on n whereas the sampling inspection cost per item 
submitted depends on n/N. The function chosen gives a compromise between 
constant discrimination and constant sampling inspection cost. Thus the authors 
try to justify the general shape of their relation between N and n by means of 
some very vague cost considerations, taking only sampling inspection cost and 
the “value” of the discrimination obtained into account. To make the system
more generally applicable five “inspection levels” have been introduced. With
level III as the “normal” amount of inspection, levels I and II require only
about half and three-fourths as much inspection as III, whereas IV and V require
one and one-half times as much and twice as much, respectively.

With n as a given function of N (and the inspection level) and the producer's
risk fixed at 5% the acceptance number is determined as a function of the
producer's risk point. All the inspector has to do to use the SRG-tables is to
choose the producer's risk point and the inspection level. The tables then give
the corresponding values of (n, c) conditioned by the two decisions made by 
the authors. As the OC-curves are published together with the plans and since 
the OC-curves are practically independent of N the SRG-tables may also be 
used with the OC-curves as starting point, i.e. finding a curve with required 
properties and then looking up the corresponding (n, c). 

The MIL-STD-105 (11) has taken over assumption (1) from the SRG (with 
only three inspection levels) whereas (2) has been replaced by the following 
assumption: the risk of rejecting lots of acceptable quality shall be small and 
shall (in the main) be a decreasing function of the lot size. This modification is 
presumably made in compliance with the intuitive feeling that large lots of 
acceptable quality should have a smaller risk of rejection than smaller lots of 
the same quality because of the greater economic losses incurred by wrongly 
rejecting a large lot. As the MIL-STD-105 like the SRG-tables contains all the 
OC-curves it may also be used “backwards.” 

In the Philips Standard Sampling System Hamaker et al. (12) have developed a 
system from similar principles as mentioned above with the modification that 
it is based on the point of control and the relative slope of the OC-curve in this 
point. The relative slope is made a function of both the lot size and the point 
of control. Hamaker writes: “How the relative slope should be made to depend
on lot size and point of control is a matter of experience. Precise rules cannot 
be laid down owing to the impossibility of determining the influencing economic 
factors, as discussed in I.” 

These few remarks on the basic assumptions of the three systems should 
suffice to show the difficulties in constructing a system of general applicability 
and also the arbitrariness of the existing systems. Perhaps the systems may best 
be viewed as a convenient way of indexing a collection of OC-curves. Only by a 
more detailed specification of the conditions and the purposes of the sampling 
inspection will it be possible to arrive at a satisfactory solution of the problem 
of constructing a sampling inspection system. 


5. THE DODGE-ROMIG SYSTEM


In the classical paper by Dodge and Romig (1) the discussion is limited to 
non-destructive inspection and it is further specified that all rejected lots are 
to be completely inspected and all defectives found replaced by effectives. It is 
further obvious that the authors consider a situation with a prior distribution 
composed of a normal part, which is binomial, and a part of considerably poorer 
quality, without specifying either the form of the “abnormal” part or the relative 
weights of the two parts. 

The one parameter in their system is the process average, $\bar{p}$, in the binomial
part of the prior distribution. The other is the consumer's risk point which is
called the lot tolerance per cent defective, $p_t$, and is associated with a risk (an
acceptance probability) of 10%. By a suitable choice of the lot tolerance per cent
defective the consumer can protect himself against accepting the occasionally
bad lots from the unspecified part of the prior distribution.

The basic idea in the Dodge-Romig system is to choose the plan which mini- 
mises the average total inspection cost for normal production and at the same 
time gives the required consumer protection. Using the cost of inspecting an 
item as unit the cost function becomes equal to the average amount of inspection 
which may be written as 


$$I = n + (N - n)(1 - P_a). \tag{9}$$


Besides being a function of n and c, $I$ is a function of p. Averaging over the
binomial part of the prior distribution gives the same expression as (9) with p
replaced by $\bar{p}$. The optimum values of (n, c) are then determined as the values
minimising $I(\bar{p})$ under the restriction that $P_a(p_t) = 0.10$.
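
The constrained minimisation may be sketched as a direct search. The Python code below is an illustration under stated assumptions, not the computational method of Dodge and Romig: the consumer's risk is evaluated from the hypergeometric distribution for a lot containing round(N p_t) defectives, the restriction is taken as "at most 0.10" because n and c are integers, the acceptance probability at the process average is taken as binomial, and the search over c is truncated at a hypothetical bound `c_max`.

```python
from math import comb

def hyper_accept(N, n, c, X):
    """Hypergeometric probability of at most c defectives in the sample."""
    return sum(comb(X, x) * comb(N - X, n - x)
               for x in range(min(c, X) + 1)) / comb(N, n)

def binom_accept(n, c, p):
    """Binomial probability of at most c defectives in the sample."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x)
               for x in range(min(c, n) + 1))

def dodge_romig_plan(N, p_bar, p_t, risk=0.10, c_max=10):
    """Return (n, c, I): the plan with the smallest average amount of
    inspection at the process average among plans meeting the risk."""
    X_t = round(N * p_t)
    best = None
    for c in range(c_max + 1):
        # For fixed c the average inspection grows with n, so the smallest
        # n that satisfies the consumer's risk is the one to consider.
        n = next((m for m in range(c + 1, N + 1)
                  if hyper_accept(N, m, c, X_t) <= risk), None)
        if n is None:
            continue
        amount = n + (N - n) * (1 - binom_accept(n, c, p_bar))
        if best is None or amount < best[2]:
            best = (n, c, amount)
    return best

if __name__ == "__main__":
    print(dodge_romig_plan(N=1000, p_bar=0.01, p_t=0.05))
```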

Let us now consider the Dodge-Romig system within a somewhat wider 
framework of assumptions. Presumably they have chosen their rather vaguely 
specified prior distribution partly because information on prior distributions 
is scarce in practice and partly to keep the number of parameters down. Even 
if the prior distribution formally is represented in the system only by the one
parameter $\bar{p}$ it is obvious that the prior distribution also influences the choice
of $p_t$. It must be assumed that the prior distribution extends on both sides of
$p_t$. Further the choice of $p_t$ must also depend on cost considerations which are
not taken explicitly into account in the system. Instead of taking all cost elements 
into consideration Dodge and Romig limit themselves to inspection costs only 
presumably because these are the most easily accessible. Even if the concept of 
a loss resulting from the acceptance of defective items is not explicitly introduced 
it must be tacitly assumed that for lots of tolerance quality it would be cheaper 
to sort the whole lot than to accept the lot without inspection whereas for lots 
of process average quality the opposite is true, cf. the discussion of the AQL- 
concept in section 3. The cost function considered by Dodge and Romig is a 
monotone function of n and c and therefore (without any restrictions on n and c) 
leads to the conclusion that minimum inspection costs for product of normal 
quality are obtained by acceptance without inspection. To reach an optimum 
sampling plan which minimises the average total inspection cost for normal
production Dodge and Romig therefore have to introduce an (arbitrary) relation
between n and c, which is achieved by means of the above-mentioned condition
that the OC-curve shall pass through a given point.


6. A MODEL BASED ON PRIOR DISTRIBUTION AND COSTS


The full significance of a sampling inspection plan can only be developed on 
the basis of the prior distribution and the economic consequences of rejection 
and acceptance. The basic question is: What happens to rejected lots and what 
happens to defective items in either accepted or rejected lots? Some possibilities
of what may happen to rejected lots are set out below. 


Rejected lots
    Sorted
        Effective items
        Defective items
            Repaired or replaced by effectives
            Scrapped
            Sold at reduced price
    Not sorted
        Scrapped
        Sold at reduced price


It will be seen that there are many possibilities. The Dodge-Romig case, for
example, corresponds to sorting of rejected lots and replacement of defectives
by effectives. If inspection is destructive, sorting is naturally out of the question.
Let us first consider the simple case where defective items found by inspection
(either sampling inspection or sorting) are replaced by effectives and the
corresponding costs are considered as part of the manufacturing costs which are not
incorporated in our model. Thus we limit ourselves in the first instance to
considering inspection costs and decision losses only.

Suppose that defective items in accepted lots cause some damage which is
measurable in economic terms. If the items considered are to be used in further
production, say, the consumer's loss by accepting a defective item may consist of
the price paid per item (or costs of rework), costs of handling and identifying 
the defective item, plus costs of assembling and dis-assembling. Such costs 
should be rather easy to evaluate. If, however, the items represent finished 
goods and are inspected by the producer the loss by accepting a defective may 
involve service and replacement costs plus loss of good-will in the market which 
is difficult to measure. 

Thus, the loss may vary widely according to the situation envisaged but in 
any case we shall use the average loss caused by an accepted defective item as the 
economic unit for the evaluation of other cost elements. On this assumption we 
can find the cost function in the one limiting case: Acceptance without inspection. 
For a lot with fraction defective equal to p the total costs will be Np. 

We further assume that the costs associated with rejected lots (after sampling
inspection) are proportional to N - n and denote these costs per item by $k_r$. Thus,
in the case of sorting of rejected lots $k_r$ is to be found as the sorting (inspection)
costs per item divided by the costs of accepting a defective item. In the
non-sorting case, which includes destructive inspection, $k_r$ is to be found as the
manufacturing costs (or market price) per item divided by the costs of accepting
a defective item. (The manufacturing costs or market price may be reduced by
the average value of rejected items.)

Let us consider another limiting case: Rejection without sampling inspection,
i.e. rejection of every lot submitted. The total costs then become $Nk_r$, which
represents the costs of sorting or manufacturing, respectively. The cost parameter
$k_r$ thus defines a break-even quality in the sense that for lots of quality $p < k_r$
it is cheaper to accept without inspection than to reject whereas for $p > k_r$ the
opposite is true. The last cost element we need is the costs of sampling inspection
which is denoted by $k_s$ and evaluated as sampling and testing costs per item
(in the sample) divided by the costs of accepting a defective item.

In the sorting case we obviously have $k_r < k_s$. For destructive inspection we
also have $k_r < k_s$ since $k_s$ besides manufacturing costs also contains the costs of
sampling and testing. If necessary we may also introduce a distinction between
the costs of sorting of rejected lots, $k_r$, and the costs of 100% inspection, since costs
of inspection per item inspected may be smaller when we have 100% inspection
of all lots than when we inspect only lots rejected after sampling inspection.

In sampling inspection the costs associated with lots of quality p = X/N
will be composed of two terms:


(a) the costs for accepted lots which are

$$nk_s + (X - x) \quad \text{for} \quad 0 \le x \le c,$$

and

(b) the costs for rejected lots which are

$$nk_s + (N - n)k_r \quad \text{for} \quad c + 1 \le x \le n,$$

where X and x denote the number of defective items in the lot and in the sample
respectively. Since the probability of getting x defectives in the sample from a
lot containing X defectives is

$$p\{x|X\} = \binom{X}{x}\binom{N-X}{n-x} \Big/ \binom{N}{n} \tag{10}$$

the average costs for lots of quality p = X/N become

$$K(n, c, p) = nk_s + \sum_{x=0}^{c} (X - x)\,p\{x|X\} + (N - n)k_r \sum_{x=c+1}^{n} p\{x|X\}. \tag{11}$$


Introducing the probability of acceptance $P_a(p)$ as defined by (1) and dividing
by N to find the costs per item submitted we get

$$\frac{K(n, c, p)}{N} = pP_a(p) + k_r\bigl(1 - P_a(p)\bigr) + \frac{n}{N}\Bigl(k_s - k_r + \sum_{x=0}^{c}\Bigl(k_r - \frac{x}{n}\Bigr)p\{x|X\}\Bigr). \tag{12}$$

The main term of this expression is a weighted average of p and $k_r$, the weight
of p being a decreasing function of p.
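
The two forms of the cost function can be compared numerically. The Python sketch below is an illustration only: it writes `k_s` for the cost of sampling inspection per item and `k_r` for the cost per item of rejected lots (this reading of the subscripts, the function names and the numerical values are assumptions), computes the average cost per lot as the sum of the three cost terms, and computes the cost per item as the weighted average of p and `k_r` plus the correction term of order n/N.

```python
from math import comb

def p_x_given_X(N, n, X, x):
    """Hypergeometric probability of x defectives in a sample of n
    from a lot of N items containing X defectives."""
    return comb(X, x) * comb(N - X, n - x) / comb(N, n)

def cost_per_lot(N, n, c, X, k_s, k_r):
    """Sampling cost + loss from accepted defectives + cost of rejection."""
    accepted = sum((X - x) * p_x_given_X(N, n, X, x) for x in range(c + 1))
    rejected = sum(p_x_given_X(N, n, X, x) for x in range(c + 1, n + 1))
    return n * k_s + accepted + (N - n) * k_r * rejected

def cost_per_item(N, n, c, X, k_s, k_r):
    """Weighted average of p and k_r plus a term proportional to n/N."""
    p = X / N
    pa = sum(p_x_given_X(N, n, X, x) for x in range(c + 1))
    extra = k_s - k_r + sum((k_r - x / n) * p_x_given_X(N, n, X, x)
                            for x in range(c + 1))
    return p * pa + k_r * (1 - pa) + (n / N) * extra

if __name__ == "__main__":
    N, n, c, k_s, k_r = 200, 20, 1, 0.08, 0.05   # hypothetical values
    for X in (2, 10, 30):
        print(X / N,
              round(cost_per_lot(N, n, c, X, k_s, k_r) / N, 6),
              round(cost_per_item(N, n, c, X, k_s, k_r), 6))
```

The two columns printed for each quality level agree, which confirms that the rearrangement into a main term and a correction term is purely algebraic.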

The cost function of sampling inspection, K(n, c, p)/N, is shown in fig. 1
together with the cost functions of the two limiting cases: k(p) = p, corresponding
to acceptance without inspection, and k(p) = $k_r$, corresponding to rejection
without sampling inspection which as a special case includes 100% inspection.

[FIG. 1. Cost functions per item submitted. Horizontal axis: quality p, from 0 to 1, with $k_r$ marked; one curve is labelled “Sampling inspection”.]

As a third limiting case we consider the situation where we know the quality 
of each submitted lot without costs. We can then always make the right decision 
whether to accept or reject the lot, i.e. we obtain the minimum cost which is 
represented by the broken line: k(p) = p for $p < k_r$ and k(p) = $k_r$ for $p > k_r$.

The theory above defines the costs as function of p, the quality of submitted 
lots, in the four cases. To find the final costs we have to average the cost-functions 
with regard to the prior distribution, i.e. to evaluate four expressions of the form 


$$\int_0^1 k(p)\,d\Phi_N(p) \tag{13}$$


where k(p) represents the cost-function and $\Phi_N(p)$ denotes the cumulative prior
distribution, i.e. the probability that the fraction defective of a submitted lot of N
items is equal to or less than p.


For acceptance without inspection we find the average costs as

$$\int_0^1 p\,d\Phi_N(p) = \bar{p}_N, \tag{14}$$

i.e. equal to the average fraction defective of the prior distribution. For rejection
without sampling inspection the average costs are equal to $k_r$. For the minimum
cost-function we have

$$k_m = \int_0^{k_r} p\,d\Phi_N(p) + \int_{k_r}^{1} k_r\,d\Phi_N(p) \tag{15}$$

which obviously is less than or equal to both $\bar{p}_N$ and $k_r$.
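
For a prior distribution concentrated on a finite number of quality levels the three averages reduce to simple sums. The Python sketch below is an illustration only; the representation of the prior as a mapping from fraction defective to probability, the function name and the numerical values are assumptions.

```python
def limiting_costs(prior, k_r):
    """Average costs per item for the three limiting cases.

    prior : mapping from fraction defective p to its prior probability
    k_r   : cost per item of rejection, in units of the loss per accepted defective
    """
    accept_all = sum(p * w for p, w in prior.items())            # mean fraction defective
    reject_all = k_r                                             # every lot rejected
    perfect = sum(min(p, k_r) * w for p, w in prior.items())     # right decision for each lot
    return accept_all, reject_all, perfect

if __name__ == "__main__":
    # Hypothetical prior: 90% of lots at 1% defective, 10% of lots at 20% defective.
    prior = {0.01: 0.9, 0.20: 0.1}
    print(limiting_costs(prior, k_r=0.05))
```

The third value can never exceed the other two, since min(p, k_r) is at most p and at most k_r for every lot; the difference between it and the smaller of the first two is the largest saving that any sampling plan could achieve.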

For the sampling inspection case it is practical to introduce a few more
concepts derived from the prior distribution. Let $f_N(X)$, $X = 0, 1, \ldots, N$, denote
the probability that a lot of N items contains X defective items and let

$$F_N(Np) = \sum_{X=0}^{[Np]} f_N(X) \tag{16}$$

denote the corresponding cumulative distribution so that

$$\Phi_N(p) = F_N(Np). \tag{17}$$


Then the simultaneous distribution of X defectives in the lot and x defectives in the
sample is given by

$$p\{X, x\} = f_N(X)\,p\{x|X\} \tag{18}$$

from which we derive the marginal distribution of x, i.e. the over-all or average
probability of x defectives in the sample, as

$$g_n(x) = \sum_{X} p\{X, x\} = \sum_{X} p\{x|X\}\,f_N(X) \tag{19}$$

and the corresponding cumulative distribution

$$G_n(np) = \sum_{x=0}^{[np]} g_n(x) = \Psi_n(p). \tag{20}$$


Returning now to the cost function we get the final average costs per lot
submitted for inspection as

$$\begin{aligned}
K(n, c) &= \sum_{X} K(n, c, X/N)\,f_N(X) \\
&= nk_s + \sum_{x=0}^{c}\sum_{X} (X - x)\,p\{X, x\} + (N - n)k_r \sum_{x=c+1}^{n}\sum_{X} p\{X, x\} \\
&= nk_s + \sum_{x=0}^{c} g_n(x)\,E\{X - x \mid x\} + (N - n)k_r \sum_{x=c+1}^{n} g_n(x)
\end{aligned} \tag{21}$$

where $E\{X - x \mid x\}$ denotes the expected number of defectives in the
non-inspected part of the lot when x defectives have been found in the sample.
Thus, the average costs consist of three terms: 


(1) The costs of sampling inspection, nk, . 

(2) The expected loss due to accepted defectives which equals the expected 
loss for a given number of defectives in the sample times the probability 
of that number for all numbers less than or equal to the acceptance 
number, i.e. 


X BUX — 2)|2} 9). 


(3) The costs of a rejected lot, (NV — n)k, , times the proportion of lots rejected, 


> Gn(2) « 


z=c+1 
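
The cost function (21) can be evaluated by direct enumeration over the joint distribution (18). The following is a minimal sketch, assuming the prior is supplied as a list `f_N` with `f_N[X]` the probability of `X` defectives in a lot of `N = len(f_N) - 1` items; the function names are illustrative and not taken from the paper.

```python
from math import comb


def hypergeom_pmf(x, n, X, N):
    """p{x | X}: probability of x defectives in a sample of n from a lot with X defectives."""
    if x < 0 or x > n or x > X or n - x > N - X:
        return 0.0
    return comb(X, x) * comb(N - X, n - x) / comb(N, n)


def average_cost(f_N, n, c, k_s, k_r):
    """K(n, c) of (21): sampling cost + loss from accepted defectives + rejection cost."""
    N = len(f_N) - 1
    cost = n * k_s                                   # term (1)
    for X, fX in enumerate(f_N):
        for x in range(min(n, X) + 1):
            p_joint = fX * hypergeom_pmf(x, n, X, N)  # (18)
            if x <= c:
                cost += (X - x) * p_joint             # term (2): lot accepted
            else:
                cost += (N - n) * k_r * p_joint       # term (3): lot rejected
    return cost
```

With `n = 0` the function reduces to the two limiting cases: `c = 0` gives acceptance without inspection (cost `N` times the prior mean) and `c = -1` gives rejection without inspection (cost `N * k_r`).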


Since $(X - x)/(N - n)$ gives the fraction defective in the non-inspected part
of the lot we may introduce

$$p_n(x) = E\{(X - x)/(N - n) \mid x\} \tag{22}$$

which denotes the average fraction defective in the non-inspected part of the lots,
given that the fraction defective found in the samples is $x/n$. Dividing by $N$ we find
from (21)

$$\frac{K(n, c)}{N} = \sum_{x=0}^{c} p_n(x) g_n(x) + \sum_{x=c+1}^{n} k_r\, g_n(x)
+ \frac{n}{N} \Big\{ k_s - k_r + \sum_{x=0}^{c} [k_r - p_n(x)]\, g_n(x) \Big\}. \tag{23}$$

For comparison with (15) it is, however, appropriate to change the variable of
summation from $x$ to $p = x/n$ and at the same time introduce the cumulative
distribution which leads to the expression

$$\frac{K(n, c)}{N} = \int_0^{(c+\frac12)/n} p_n(np)\, d\Psi_n(p) + \int_{(c+\frac12)/n}^{1} k_r\, d\Psi_n(p)
+ \frac{n}{N} \Big\{ k_s - k_r + \int_0^{(c+\frac12)/n} [k_r - p_n(np)]\, d\Psi_n(p) \Big\}. \tag{24}$$


The problem now is to determine the optimum plan, i.e. to find the values of $n$
and $c$ which minimise $K(n, c)/N$ for a given prior distribution and given values
of the cost parameters $k_s$ and $k_r$, and to compare the corresponding minimum costs
with $\bar p_N$ and $k_r$ to find out whether sampling inspection is advantageous as
compared to the two limiting cases. This problem will be solved in section 9 after the
fundamental functions $g_n(x)$ and $E\{X - x \mid x\}$ have been studied in sections 7 and 8.
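
For small lots the comparison described above can be carried out by brute force. The sketch below is not the method of section 9; it simply enumerates every plan $(n, c)$ and compares the best one with the two decisions taken without inspection. The prior (a two-component mixed binomial), the cost values and the function names are assumptions made for illustration.

```python
from math import comb


def cost_per_item(f_N, n, c, k_s, k_r):
    """K(n, c)/N by enumeration of the joint distribution of (X, x)."""
    N = len(f_N) - 1
    total = n * k_s
    for X, fX in enumerate(f_N):
        for x in range(min(n, X) + 1):
            p = fX * comb(X, x) * comb(N - X, n - x) / comb(N, n)
            total += (X - x) * p if x <= c else (N - n) * k_r * p
    return total / N


def best_plan(f_N, k_s, k_r):
    """Return (cost per item, decision) minimising the average cost."""
    N = len(f_N) - 1
    p_bar = sum(X * fX for X, fX in enumerate(f_N)) / N
    candidates = [(p_bar, "accept without inspection"),
                  (k_r, "reject without inspection")]
    for n in range(1, N):
        for c in range(n):
            candidates.append((cost_per_item(f_N, n, c, k_s, k_r), ("sample", n, c)))
    return min(candidates, key=lambda item: item[0])


# Hypothetical prior: 80% of lots from a process with p = 0.02, 20% with p = 0.30.
N = 30
f_N = [0.8 * comb(N, X) * 0.02**X * 0.98**(N - X)
       + 0.2 * comb(N, X) * 0.30**X * 0.70**(N - X) for X in range(N + 1)]
best_cost, best_choice = best_plan(f_N, k_s=0.10, k_r=0.10)
print(best_cost, best_choice)
```

The enumeration grows roughly as the fourth power of the lot size, so it is only a check for small $N$, not a substitute for the analytic solution.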

However, before leaving the model we want to discuss what properties a
system of optimum sampling plans ought to have for large lots in relation to the
prior distribution and the costs. Let us suppose that the prior distribution has
a limit for $N \to \infty$, i.e.

$$\lim_{N \to \infty} \Phi_N(p) = \Phi(p). \tag{25}$$

The average minimum costs then converge to

$$k = \int_0^{k_r} p \, d\Phi(p) + \int_{k_r}^{1} k_r \, d\Phi(p). \tag{26}$$

If all of the prior distribution is below $k_r$, i.e. $\Phi(k_r) = 1$, all lots should be accepted
without inspection and we have $k = \bar p$. Correspondingly, if $\Phi(k_r) = 0$ all lots
should be rejected (or 100% inspected) and $k = k_r$. Only if $0 < \Phi(k_r) < 1$ do we
have to choose between acceptance and rejection. From an economic point of
view the fraction $\Phi(k_r)$ of the prior distribution may be called the acceptable part
of the prior distribution since for lots of quality $p < k_r$ it is cheaper to accept
than to reject. In the following we shall assume that $0 < \Phi(k_r) < 1$.

The minimum cost situation which implies complete and free information about
the quality of submitted lots may be considered as a limiting case of sampling
inspection. Let us assume that sample size tends to infinity with lot size in such a
manner that $n/N \to 0$. This means, however, that for large lots we know the
quality completely (with probability 1) and that the costs of this information
(which are proportional to $n/N$) are of no importance as compared to the costs
of acceptance and the costs of rejection. Therefore, it is reasonable to require
that the average costs per submitted item in a system of optimum sampling plans
should converge to $k$ for $N \to \infty$, i.e.

$$\lim_{N \to \infty} \frac{K(n, c)}{N} = \int_0^{k_r} p \, d\Phi(p) + \int_{k_r}^{1} k_r \, d\Phi(p). \tag{27}$$

Considering the limiting value of (24) we are first of all led to investigate the
limit of $\Psi_n(p)$. Since the hypergeometric distribution $p\{x \mid X\}$ for fixed $n$,
$X/N = p$, and $N \to \infty$ converges to the binomial distribution with parameters
$n$ and $p$, and since we have assumed that $\Phi_N(p) \to \Phi(p)$ it follows that the
compound hypergeometric distribution, $g_n(x)$, with weight function $d\Phi_N(p)$, converges
to a compound binomial distribution with weight function $d\Phi(p)$, i.e.

$$\lim_{N \to \infty} g_n(x) = \int_0^1 \binom{n}{x} p^x q^{n-x} \, d\Phi(p). \tag{28}$$

For $n \to \infty$, however, the binomial distribution converges to a one-point
distribution, so that the compound binomial distribution tends to its weight function,
i.e. $\Psi_n(p) \to \Phi(p)$.

For a rather wide class of prior distributions it may be shown that $p_n(x) \sim x/n$
for large $N$, see section 8. The intuitive explanation of this result is that when we
consider lots having led to a large-sample fraction defective of $x/n = p$, say,
then these lots themselves must also be of quality $X/N = p$ (with probability 1)
and consequently the fraction defective of the non-inspected part of the lots
must be $p$ since $(X - x)/(N - n) = x/n + [N/(N - n)]\,[X/N - x/n]$.

By means of the two results above it follows from a comparison of (24) and
the limit required (27) that $c/n$ must tend to $k_r$ for $n \to \infty$ apart from possible
“deviations” caused by discreteness of $\Phi(p)$. This means that the acceptance
number must tend to infinity nearly proportional to the sample size as the lot
size increases. The last term of (24) tends to zero since the coefficient of $n/N$
has a finite upper limit. As consequences we find that the over-all acceptance
probability tends to the acceptable part of the prior distribution, i.e.

$$G_n(c) = \int_0^{(c+\frac12)/n} d\Psi_n(p) \to \Phi(k_r), \tag{29}$$

that the average fraction defective of accepted lots tends to the average fraction
defective in the acceptable part of the limiting prior distribution which is

$$\int_0^{k_r} p \, d\Phi(p) \Big/ \Phi(k_r), \tag{30}$$

and further, in the case of sorting of rejected lots, that

$$E(AOQ) \to \int_0^{k_r} p \, d\Phi(p) \tag{31}$$

and

$$AOQL \to k_r \tag{32}$$

apart from the modification necessitated by the discreteness of $\Phi(p)$.
The above results may be summarised in the following theorems. 







1. If $\Phi_N(k_r) = 1$ the optimum plan is acceptance without inspection.

2. If $\Phi_N(k_r) = 0$ the optimum plan is rejection without inspection.

3. If $0 < \Phi_N(k_r) < 1$ the optimum sampling inspection plan is determined by
   minimising $K(n, c)$ and the minimum obtained has to be compared to
   $\bar p_N$ and $k_r$ to decide whether sampling inspection, acceptance without
   inspection, or rejection (total inspection) is to be preferred.

4. If $N$ is large and $0 < \Phi(k_r) < 1$ sampling inspection is always preferable
   to acceptance without inspection and rejection without inspection. The
   minimum costs are

   $$k = \int_0^{k_r} p \, d\Phi(p) + \int_{k_r}^{1} k_r \, d\Phi(p)$$

   which represents a maximum saving in per cent of

   $$\frac{100}{\bar p} \int_{k_r}^{1} (p - k_r) \, d\Phi(p)$$

   as compared to acceptance without inspection and a saving of

   $$\frac{100}{k_r} \int_0^{k_r} (k_r - p) \, d\Phi(p)$$

   as compared to rejection without inspection.

5. For the optimum sampling plans $n$ and $c$ tend to infinity with $N$ in such
   a way that $n/N \to 0$ and $c/n \to k_r$ if $p$ has a continuous limiting distribution
   whereas $c/n \to h$ with $h$ in the “neighbourhood” of $k_r$ in the discrete case.


Before leaving the model we also want to indicate some simple but practically
important extensions for which the same mathematical technique applies.
In the model above (model I, say) we disregarded manufacturing costs and
charged manufacturing with the costs of defective items found by either sampling
or total inspection. In many cases, for instance if producer and consumer are
two departments of the same firm, it is more natural to consider manufacturing
costs, inspection costs and decision losses as a whole and minimise the total
costs. To describe this situation we need one more cost-constant, $k_m$, which
denotes manufacturing costs per item (or market price per item), measured as usual
in units of the average loss caused by an accepted defective item. We assume
furthermore that the manufacturing department has to supply effective items instead
of defectives when these are found by inspection and that this can be done with
cost $k_m$ per item (which is not quite correct but may be acceptable as an
approximation).

The average costs per item submitted then become the following:

(1) For acceptance without inspection:

$$k_m + \bar p_N = k_m + k_m \bar p_N + (1 - k_m) \bar p_N$$

(2) For 100% inspection:

$$k_m + k_m \bar p_N + k_r$$

(3) For sampling inspection with total inspection of rejected lots:

$$\begin{aligned}
\frac{K_2(n, c)}{N} &= k_m + k_m \bar p_N + \frac{n}{N} k_s
+ \Big(1 - \frac{n}{N}\Big) \Big\{ (1 - k_m) \sum_{x=0}^{c} p_n(x) g_n(x) + k_r \sum_{x=c+1}^{n} g_n(x) \Big\} \\
&= k_m + k_m \bar p_N + (1 - k_m) \frac{K_1(n, c)}{N}
\end{aligned} \tag{33}$$

where $K_1(n, c)/N$ represents the cost-function for model I with $k_s$ and $k_r$ replaced
by $k_s/(1 - k_m)$ and $k_r/(1 - k_m)$, respectively. Thus, the optimum sampling
plan in model II is determined by exactly the same method as for model I.
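
The reduction of model II to model I can be checked term by term. The following is a sketch of that calculation, assuming (as the description of model II suggests, although the text does not spell it out) that a rejected lot is totally inspected at cost $k_r$ per remaining item and that every defective found, in the sample or in a sorted lot, is replaced at cost $k_m$.

```latex
% Total cost per lot in model II; E{x} = n \bar p_N and
% \sum_{x=0}^{n} g_n(x) p_n(x) = \bar p_N are used in the second step.
\begin{aligned}
K_2(n,c) &= N k_m + n k_s + k_m E\{x\}
  + \sum_{x=0}^{c} g_n(x)\,(N-n)\,p_n(x)
  + \sum_{x=c+1}^{n} g_n(x)\,(N-n)\,[\,k_r + k_m p_n(x)\,] \\
 &= N k_m + n k_s + k_m N \bar p_N
  + (N-n)\Big\{ (1-k_m) \sum_{x=0}^{c} p_n(x)\, g_n(x)
  + k_r \sum_{x=c+1}^{n} g_n(x) \Big\} \\
 &= N k_m + N k_m \bar p_N + (1-k_m)
  \Big[\, n \frac{k_s}{1-k_m}
  + (N-n) \sum_{x=0}^{c} p_n(x)\, g_n(x)
  + (N-n) \frac{k_r}{1-k_m} \sum_{x=c+1}^{n} g_n(x) \Big].
\end{aligned}
```

The bracket in the last line is the model I cost function (21) with $k_s$ and $k_r$ divided by $1 - k_m$, so the constant terms $k_m + k_m \bar p_N$ do not influence the choice of $n$ and $c$.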

Model II may also be used to solve a slightly different problem. Suppose that 
two (or more) manufacturing processes are available, a cheap one with a “poor” 
prior distribution and a more expensive one with a better prior distribution. 
The problem then is whether to use the cheap manufacturing process and a 
relatively large amount of inspection or the expensive process and a smaller 
amount of inspection. This may be solved by determining the optimum plan 
for each of the processes in question and afterwards selecting the process having 
minimum total costs. 

Similar considerations may be useful for a buyer who has to choose between 
several suppliers, each supplier having his own market price (k,,) and a corre- 
sponding prior distribution. In such situations the problem is to strike the right 
balance between manufacturing costs (or market price) and inspection costs 
taking also losses from accepted defectives into account. 

Recently authors have constructed models similar to the two given here and
tried to determine an optimum plan on this basis. In these attempts, however,
the authors used special prior distributions or considered only special sampling
plans. Satterthwaite, according to Hamaker (13), used a prior distribution
composed of two fractions defective $p_1$ and $p_2$ occurring with relative frequencies
$w_1$ and $w_2$, respectively. This in a sense has been generalised by Barnard (14)
who used a mixed binomial with two components. Sittig (15) used the function
$a(1 - p)^{a-1}$ as prior distribution which is a special case of the Beta-distribution
used by Champernowne (16) and James (17). Weibull (18) has given a
detailed discussion of cost problems and proposed a solution based on the
Dodge-Romig tables. Lastly, Horsnell (19) has discussed the above-mentioned prior
distributions and several others in connection with a cost function which is
essentially different from the one proposed here. A survey of these models has
been given by Hamaker in (13) and (20).

Here we are going to derive the two basic equations for the determination of
$n$ and $c$ by minimising the cost function given above without making any
assumptions on the functional form of the prior distribution. After having obtained the
general solution we limit the class of prior distributions to the class of
reproducible distributions (for which the cost function becomes linear in $N$) and study
this solution in greater detail.

Before deriving the equations for the optimum sampling plan, we study in
section 7 the general properties of the two basic functions in the cost equation,
namely $g_n(x)$, the distribution of the number of defectives in the sample, and
$p_n(x)$, the expected fraction defective in the non-inspected part of the lots when
the fraction defective in the samples is $x/n$. In section 8 we continue this study
for the special case of reproducible prior distributions.


7. THE COMPOUND HYPERGEOMETRIC DISTRIBUTION

From the prior distribution $f_N(X)$, i.e. the probability that a lot of $N$ items
contains $X$ defective items, and the conditional hypergeometric distribution
$p\{x \mid X\}$, i.e. the probability that a sample of $n$ items contains $x$ defectives when
the lot contains $X$ defectives, we derived by multiplication the simultaneous
distribution $p\{X, x\}$, see (18). Introducing as new variable $y = X - x$, i.e. $y$
denotes the number of defective items in the non-inspected part of the lot, we find
the distribution

$$p\{x, y\} = f_N(x + y) \binom{n}{x} \binom{N - n}{y} \Big/ \binom{N}{x + y} \tag{34}$$

from which we derive the marginal distribution of $x$ as

$$g_n(x) = \binom{n}{x} \sum_{y=0}^{N-n} f_N(x + y) \binom{N - n}{y} \Big/ \binom{N}{x + y}. \tag{35}$$

This distribution may be called the compound hypergeometric distribution since
it is produced by averaging the hypergeometric distribution for given $x$ over all
possible values of $X$ according to the prior distribution.
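
Formula (35) translates directly into a short routine. The following is a minimal sketch, assuming the prior is stored as a list `f_N` indexed by the number of defectives; the name `compound_hypergeom` is an illustrative choice.

```python
from math import comb


def compound_hypergeom(f_N, n):
    """g_n(x), x = 0..n, from (35) for a prior f_N on X = 0..N."""
    N = len(f_N) - 1
    g = []
    for x in range(n + 1):
        inner = sum(f_N[x + y] * comb(N - n, y) / comb(N, x + y)
                    for y in range(N - n + 1))
        g.append(comb(n, x) * inner)
    return g


# A one-point prior (every lot has exactly 4 defectives among 10 items)
# gives back the ordinary hypergeometric distribution.
one_point = [0.0] * 11
one_point[4] = 1.0
print(compound_hypergeom(one_point, 3))
```

A binomial prior likewise returns a binomial distribution with the same parameter, which is the reproducibility property taken up in section 8.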

Even if the remarks about the compound hypergeometric distribution are 
phrased in the usual terminology of sampling inspection it is rather obvious that 
the theory presented in sections 7 and 8 is of a quite general nature and applies 
to sampling without replacement from dichotomized populations with a given 
distribution of the number of elements having the attributes considered. The 
following theorems may for example prove to be useful in developing a theory 
for analysing the variation of relative frequencies within a two stage sampling 
model analogous to the variance-component model for continuous variation. 
The theorems may also be considered useful from the point of view that a number 
of well-known distributions, for example the hypergeometric, the binomial, the Polya, 
and the rectangular distribution, may be considered as special cases of the compound 
hypergeometric distribution. 

For some purposes it is practical to write

$$g_n(x) = \binom{n}{x} h_n(x) \tag{36}$$

where the definition of $h_n(x)$ is obvious from (35) and (36). In the following we
shall for convenience define $\binom{b}{a} = 0$ for $b < a$ and also for $a < 0$.


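The moments of $(x, y)$ may be expressed by the moments of $X$. First we seek
the factorial moments for a given value of $X$. Defining

$$x^{(r)} = x(x - 1) \cdots (x - r + 1)$$

we find the conditional expectation

$$\begin{aligned}
E\{x^{(r)} y^{(s)} \mid X\} &= \sum_x x^{(r)} (X - x)^{(s)} \binom{X}{x} \binom{N - X}{n - x} \Big/ \binom{N}{n} \\
&= \frac{n^{(r)} (N - n)^{(s)}}{N^{(r+s)}} X^{(r+s)} \sum_x \binom{X - r - s}{x - r} \binom{N - X}{n - x} \Big/ \binom{N - r - s}{n - r} \\
&= \frac{n^{(r)} (N - n)^{(s)}}{N^{(r+s)}} X^{(r+s)}.
\end{aligned}$$

Averaging over the prior distribution we find the $(r, s)$th factorial moment

$$\mu_{(r)(s)} = E\{x^{(r)} y^{(s)}\} = \frac{n^{(r)} (N - n)^{(s)}}{N^{(r+s)}} m_{(r+s)} \tag{37}$$

where $m_{(r+s)}$ denotes the $(r + s)$th factorial moment of $X$. By means of this
result and the relation

$$\mu_2 = \mu_{(2)} - \mu_1 (\mu_1 - 1)$$

the ordinary moments of the first and second order are easily found. Denoting
the mean and variance of $X$ by $N\bar p$ and $\sigma_X^2$ we find $\mu_{10} = n \bar p$,
$\mu_{01} = (N - n) \bar p$, and further

$$\mu_{20} = \sigma_x^2 = \frac{n}{N} \Big[ \frac{n - 1}{N - 1} \sigma_X^2 + \frac{N - n}{N - 1} N \bar p \bar q \Big], \tag{38}$$

$$\mu_{02} = \sigma_y^2 = \frac{N - n}{N} \Big[ \frac{N - n - 1}{N - 1} \sigma_X^2 + \frac{n}{N - 1} N \bar p \bar q \Big], \tag{39}$$

$$\mu_{11} = \sigma_{xy} = \frac{n (N - n)}{N (N - 1)} \big[ \sigma_X^2 - N \bar p \bar q \big]. \tag{40}$$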


Similar results have previously been derived by Mood (9) who on the basis
of (40) formulated the following theorem:


The correlation coefficient for the number of defectives in the sample and the number
of defectives in the remaining part of the lot is positive, zero, or negative according
as the variance of the prior distribution is larger than, equal to, or less than the
corresponding binomial variance $N \bar p \bar q$.
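
Mood's theorem can be checked numerically by computing the covariance of $(x, y)$ directly from the joint distribution (34) and comparing it with the closed form (40). The sketch below assumes the prior is a list `f_N` indexed by the number of defectives; the function names are illustrative.

```python
from math import comb


def joint(f_N, n):
    """p{x, y} of (34) as a dictionary keyed by (x, y)."""
    N = len(f_N) - 1
    return {(x, y): f_N[x + y] * comb(n, x) * comb(N - n, y) / comb(N, x + y)
            for x in range(n + 1) for y in range(N - n + 1)}


def covariance_direct(f_N, n):
    p = joint(f_N, n)
    ex = sum(x * w for (x, _), w in p.items())
    ey = sum(y * w for (_, y), w in p.items())
    return sum(x * y * w for (x, y), w in p.items()) - ex * ey


def covariance_formula(f_N, n):
    """Right-hand side of (40)."""
    N = len(f_N) - 1
    mean = sum(X * w for X, w in enumerate(f_N))
    var = sum(X * X * w for X, w in enumerate(f_N)) - mean**2
    p_bar = mean / N
    return n * (N - n) / (N * (N - 1)) * (var - N * p_bar * (1 - p_bar))


uniform = [1 / 11] * 11                 # variance larger than binomial
fixed = [0.0] * 11
fixed[3] = 1.0                          # variance zero, smaller than binomial
print(covariance_direct(uniform, 4), covariance_direct(fixed, 4))
```

The rectangular prior gives a positive covariance and the one-point prior a negative one, in line with the theorem.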


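Another form of the covariance is found directly from its definition since

$$\mu_{11} = E\{x(X - x)\} - E\{x\} E\{X - x\} = \frac{n}{N} \sigma_X^2 - \sigma_x^2 ,$$

i.e.

$$\mu_{11} = \sigma_{xy} = \sigma_x^2 \Big( \frac{n}{N} \frac{\sigma_X^2}{\sigma_x^2} - 1 \Big). \tag{41}$$

The variance of the number of defectives in the sample may also be written as

$$\sigma_x^2 = n \bar p \bar q \, \frac{N - n}{N - 1} \Big[ 1 + \frac{n - 1}{N - n} \frac{\sigma_X^2}{N \bar p \bar q} \Big].$$

This means that the variance equals the “hypergeometric variance” for the
number of defectives in a sample from a lot of average quality times a factor
larger than one.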

It is seen from these formulas that the ratio between the variance in the
prior distribution and the variance in the corresponding binomial (prior)
distribution is an important quantity for the characterization of the distribution of
$(x, y)$. This leads back to the classical concept of distributions with normal
(i.e. binomial), hypernormal, and subnormal dispersion. As remarked by Mood
it is essential in the theory of sampling inspection to discriminate between the
two cases where the prior distribution has hypernormal and subnormal
dispersion, respectively. Defining $\delta_N$ by the equation

$$\sigma_X^2 = N \bar p \bar q (1 + \delta_N), \qquad \delta_N > -1, \tag{42}$$

so that the prior distribution is hypernormal for $\delta_N > 0$ and subnormal for $\delta_N < 0$
we find from (38) and (40)

$$\sigma_x^2 = n \bar p \bar q \Big( 1 + \frac{n - 1}{N - 1} \delta_N \Big) \tag{43}$$

$$\sigma_{xy} = \frac{n (N - n)}{N - 1} \bar p \bar q \, \delta_N . \tag{44}$$

It follows that the distribution of $x$ (and also the distribution of $y$) is hypernormal
or subnormal according as the prior distribution is hypernormal or subnormal.
The coefficient of dispersion is, however, reduced from $\delta_N$ to $\delta_N (n - 1)/(N - 1)$.
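Turning now to the conditional distribution of $y$ for given $x$ we have

$$p\{y \mid x\} = p\{x, y\} / g_n(x) \tag{45}$$

and the corresponding factorial moments

$$\mu_{(r)}(x) = E\{y^{(r)} \mid x\} = \sum_y y^{(r)} p\{x, y\} / g_n(x). \tag{46}$$

Evaluating the numerator we find

$$\begin{aligned}
\sum_y y^{(r)} p\{x, y\} &= \binom{n}{x} \sum_y y^{(r)} \binom{N - n}{y} f_N(x + y) \Big/ \binom{N}{x + y} \\
&= (N - n)^{(r)} \binom{n}{x} \sum_y \binom{N - n - r}{y - r} f_N(x + y) \Big/ \binom{N}{x + y} \\
&= (N - n)^{(r)} \binom{n}{x} h_{n+r}(x + r) \\
&= (N - n)^{(r)} \frac{(x + r)^{(r)}}{(n + r)^{(r)}} g_{n+r}(x + r)
\end{aligned}$$

leading to the factorial moments

$$\mu_{(r)}(x) = (N - n)^{(r)} (x + r)^{(r)} g_{n+r}(x + r) \big/ (n + r)^{(r)} g_n(x). \tag{47}$$

According to (21) and (22) we are particularly interested in the first moment.
We have

$$p_n(x) = \mu_{(1)}(x)/(N - n) = [(x + 1)\, g_{n+1}(x + 1)] / [(n + 1)\, g_n(x)]. \tag{48}$$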


For given prior distribution and given fraction defective $x/n$ in the samples
$p_n(x)$ gives the average fraction defective in the non-inspected part of the lots.
It is therefore reasonable to use $p_n(x)$ as estimator of (the random variable)
$(X - x)/(N - n)$ instead of the usual estimator $x/n$ which is independent of
the prior distribution.
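
The estimator $p_n(x)$ can be computed either from (48) or directly as a conditional expectation from (34). The sketch below does both for a rectangular prior; the helper names are illustrative, and the code assumes $n < N$ and $g_n(x) > 0$.

```python
from math import comb


def g(f_N, n):
    """Compound hypergeometric distribution (35)."""
    N = len(f_N) - 1
    return [comb(n, x) * sum(f_N[x + y] * comb(N - n, y) / comb(N, x + y)
                             for y in range(N - n + 1))
            for x in range(n + 1)]


def p_n(f_N, n, x):
    """p_n(x) from (48)."""
    return (x + 1) * g(f_N, n + 1)[x + 1] / ((n + 1) * g(f_N, n)[x])


def p_n_direct(f_N, n, x):
    """E{y | x} / (N - n), evaluated from the joint distribution (34)."""
    N = len(f_N) - 1
    weights = [f_N[x + y] * comb(N - n, y) / comb(N, x + y) for y in range(N - n + 1)]
    return sum(y * w for y, w in enumerate(weights)) / sum(weights) / (N - n)


rectangular = [1 / 13] * 13             # f_N(X) = 1/(N + 1) with N = 12
print(p_n(rectangular, 4, 1), p_n_direct(rectangular, 4, 1), 1 / 4)
```

For the rectangular prior the estimator is $(x + 1)/(n + 2)$, which differs from the usual estimator $x/n$ printed last.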

If only the first two moments of the prior distribution are known $p_n(x)$ may be
approximated by the least-square linear regression

$$p_n(x) \simeq \bar p + \beta_n (x/n - \bar p) \tag{49}$$

where

$$\beta_n = n \delta_N / [N - 1 + (n - 1) \delta_N] \tag{50}$$

according to (43) and (44).

We may also interpret $p_n(x)$ as the average conditional probability of getting a
defective in the $(n + 1)$st drawing after having got $x$ defectives in the first $n$ drawings.
For given $X$ the conditional probability discussed is $(X - x)/(N - n) = y/(N - n)$.
Averaging over the prior distribution leads to $E\{y \mid x\}/(N - n)$, i.e. $p_n(x)$.

By means of this probability we can derive a recursion formula for the
compound hypergeometric distribution in the usual way. The “state” $(n + 1, x)$
can only follow after one of the “states” $(n, x)$ or $(n, x - 1)$ by drawing an
effective or a defective, respectively. This leads to the difference equation

$$\begin{aligned}
g_{n+1}(x) &= g_n(x) [1 - p_n(x)] + g_n(x - 1)\, p_n(x - 1) \\
&= g_n(x) - [(x + 1)/(n + 1)]\, g_{n+1}(x + 1) + [x/(n + 1)]\, g_{n+1}(x)
\end{aligned}$$

which may be reduced to the fundamental recursion formula for the compound
hypergeometric distribution

$$g_n(x) = [(n + 1 - x)/(n + 1)]\, g_{n+1}(x) + [(x + 1)/(n + 1)]\, g_{n+1}(x + 1), \tag{51}$$

where $n \le N - 1$ and $g_N(x) = f_N(x)$.
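
The recursion (51) gives all the sample distributions at once by working downwards from the prior. The following is a minimal sketch under the same list representation of the prior as in the earlier examples; the name `sample_distributions` is illustrative.

```python
from math import comb


def sample_distributions(f_N):
    """Return {n: [g_n(0), ..., g_n(n)]} for n = N, N-1, ..., 0 using (51)."""
    N = len(f_N) - 1
    g = {N: list(f_N)}                  # g_N(x) = f_N(x)
    for n in range(N - 1, -1, -1):
        nxt = g[n + 1]
        g[n] = [((n + 1 - x) * nxt[x] + (x + 1) * nxt[x + 1]) / (n + 1)
                for x in range(n + 1)]
    return g


prior = [comb(8, X) * 0.3**X * 0.7**(8 - X) for X in range(9)]
g = sample_distributions(prior)
print(g[3])
```

Each step costs only $n + 1$ operations, so the whole family is obtained with of the order of $N^2$ operations, compared with a double sum per value of $n$ when (35) is applied directly.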

This result may naturally also be derived from the corresponding formula
for the hypergeometric distribution by averaging over the prior distribution.
Denoting the hypergeometric distribution (10) for given $X$ by $\varphi_n(x)$ we have

$$\varphi_{n+1}(x) = \varphi_n(x) [1 - (X - x)/(N - n)] + \varphi_n(x - 1)(X - x + 1)/(N - n).$$

Multiplying this relation by $f_N(X)$ and summing over $X$ leads to (51) after some
reduction. Formula (51) includes several well-known formulas, e.g. for the
binomial distribution the formula

$$b_{n+1}(x) = q\, b_n(x) + p\, b_n(x - 1).$$

Formula (51) may also be written in the form

$$\Delta_n g_n(x) = -\frac{1}{n + 1} \Delta_x [x\, g_{n+1}(x)] \tag{52}$$

where $\Delta_n$ and $\Delta_x$ denote the forward difference operators with regard to $n$ and $x$,
respectively. This form is useful for determining the optimum sampling plan
where we have to take differences of $g_n(x)$ with regard to $n$.

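Introducing the cumulative distribution

$$G_n(x) = \sum_{\nu=0}^{x} g_n(\nu) \tag{53}$$

we find by summation of (52) that

$$\Delta_n G_n(x) = -[(x + 1)/(n + 1)]\, g_{n+1}(x + 1) = -g_n(x)\, p_n(x). \tag{54}$$

To generalise this result we introduce the function $h_n(x)$ defined by (36). The
recursion formula (51) then becomes

$$h_n(x) = h_{n+1}(x) + h_{n+1}(x + 1) \tag{55}$$

or

$$\Delta_n h_n(x) = -h_{n+1}(x + 1) \tag{56}$$

and in general

$$\Delta_n^r h_n(x) = (-1)^r h_{n+r}(x + r). \tag{57}$$

Writing

$$S_x\, g_n(x) = \sum_{\nu=0}^{x} g_n(\nu) = G_n(x)$$

and

$$S_x^r\, g_n(x) = \sum \cdots \sum g_n(\nu) = G_n^{(r)}(x)$$

formula (54) takes the form

$$S_x \Delta_n \Big[ \binom{n}{x} h_n(x) \Big] = (-1) \binom{n}{x} h_{n+1}(x + 1) \tag{58}$$

which may be generalised to

$$S_x^r \Delta_n^r \Big[ \binom{n}{x} h_n(x) \Big] = (-1)^r \binom{n}{x} h_{n+r}(x + r). \tag{59}$$

The proof depends on the following lemma which is analogous to (52).

$$\begin{aligned}
\Delta_n \Big[ \binom{n}{x} h_{n+\mu}(x + \mu) \Big]
&= h_{n+\mu+1}(x + \mu)\, \Delta_n \binom{n}{x} + \binom{n}{x} \Delta_n h_{n+\mu}(x + \mu) \\
&= [h_{n+\mu}(x + \mu) + \Delta_n h_{n+\mu}(x + \mu)]\, \Delta_n \binom{n}{x} - \binom{n}{x} h_{n+\mu+1}(x + \mu + 1) \\
&= \binom{n}{x - 1} h_{n+\mu+1}(x + \mu) - \binom{n}{x} h_{n+\mu+1}(x + \mu + 1) \\
&= -\Delta_x \Big[ \binom{n}{x - 1} h_{n+\mu+1}(x + \mu) \Big].
\end{aligned} \tag{60}$$

Summing with regard to $x$ we find

$$S_x \Delta_n \Big[ \binom{n}{x} h_{n+\mu}(x + \mu) \Big] = -\binom{n}{x} h_{n+\mu+1}(x + \mu + 1). \tag{61}$$

Repeated applications of this result, starting from (58), lead immediately to
(59) which in terms of $g_n(x)$ may be written as

$$\Delta_n^r G_n^{(r)}(x) = (-1)^r \frac{(x + r)^{(r)}}{(n + r)^{(r)}} g_{n+r}(x + r) \tag{62}$$

or by means of (48) as

$$\Delta_n^r G_n^{(r)}(x) = (-1)^r g_n(x)\, p_n(x)\, p_{n+1}(x + 1) \cdots p_{n+r-1}(x + r - 1). \tag{63}$$

Combining (47) and (62) we get for the marginal moments of $y$

$$\mu_{(0)(r)} = (-1)^r (N - n)^{(r)} \big[ \Delta_n^r G_n^{(r+1)}(x) \big]_{x=n}$$

and a similar expression for $\mu_{(r)(0)}$ by substituting $n$ for $N - n$.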


8. REPRODUCIBLE DISTRIBUTIONS 


As indicated in section 6 a special type of the compound hypergeometric
distribution leads to a particularly simple solution of our problem. Consider a sequence
of distributions $f_N(X)$, $N = 1, 2, \ldots$. If the compound hypergeometric
distribution $g_n(x)$ as defined by (35) is equal to $f_n(x)$ then the prior distributions are said
to be reproducible or invariant to hypergeometric sampling. To interpret this
definition we imagine a process capable of generating lots (finite populations) of
any size $N$. From every finite lot of size $N > n$ we draw a random sample of
size $n$ without replacement. The distribution of the samples will then be the
same as the distribution of lots of size $n$ generated by the process, i.e. inferences
regarding the process may be drawn from the samples as if the samples were
produced directly from the process and not via a finite population. As also
shown in section 6 it is natural to require that the sequence $f_N(X)$ has a limit
which is non-degenerate.

Before discussing the general case we consider some special prior distributions. 
In this collection of examples we have also included the hypergeometric distri- 
bution itself even if a sequence of hypergeometric distributions can only be 
finite. The hypergeometric distribution (and the Polya distribution with negative 
coefficient of dispersion) has, however, similar properties as the other examples 
considered. 


8.1. The hypergeometric distribution. 


Consider the situation that we always have a stock of the same composition
and size, namely $M$ items containing $A = M \bar p$ defectives, and that lots of $N$
items are selected at random from this stock. The prior distribution then becomes

$$f_N(X; \bar p, M) = \binom{M \bar p}{X} \binom{M \bar q}{N - X} \Big/ \binom{M}{N} \tag{64}$$

with mean $N \bar p$ and variance

$$\sigma_X^2 = N \bar p \bar q \, \frac{M - N}{M - 1}$$

so that

$$\delta_N = -\frac{N - 1}{M - 1},$$

i.e. the distribution is subnormal. Inserting (64) into (34) and rearranging
factors we get

$$p\{x, y\} = f_n(x; \bar p, M)\, f_{N-n}(y; p_x, M - n) \tag{65}$$

where

$$p_x = \frac{M \bar p - x}{M - n} = \Big( \bar p - \frac{x}{M} \Big) \Big/ \Big( 1 - \frac{n}{M} \Big). \tag{66}$$

This means (1) that the distribution of $x$, $g_n(x)$, is a hypergeometric distribution
with the same parameters as the prior distribution and (2) that the conditional
distribution of $y$ for given $x$, $p\{y \mid x\}$, is also a hypergeometric distribution with
parameters $p_x$ and $M - n$.

It follows that $p_n(x) = p_x$, i.e. $p_n(x)$ is a decreasing function of $x$ in accordance
with the subnormality of the prior distribution.


8.2. The binomial distribution. 


Let the lots submitted for inspection be produced by a process in control
with process average equal to $\bar p$ so that

$$f_N(X; \bar p) = \binom{N}{X} \bar p^{X} \bar q^{N-X} \tag{67}$$

with mean $N \bar p$, variance $N \bar p \bar q$, and $\delta_N = 0$. Inserting (67) into (34) leads to

$$p\{x, y\} = f_n(x; \bar p)\, f_{N-n}(y; \bar p) \tag{68}$$

which means that $x$ and $y$ are stochastically independent and that both variables
are binomially distributed with the same parameter as in the prior distribution.
Further, $p_n(x) = \bar p$. This result has been found by Mood (9).


8.3. The Polya distribution. 


Consider a production process starting with the probability $\bar p$ of producing
a defective and let the probability of producing a defective change during
production of the lot so that after having produced $(\nu + \mu)$ items of which $\nu$ are
defective the probability that the next item shall be defective is

$$\frac{\bar p + \nu \gamma}{1 + (\nu + \mu) \gamma},$$

i.e. a linear function of $\nu$ for given $\nu + \mu$.
This leads to the Polya distribution

$$f_N(X; \bar p, \gamma) = \binom{N}{X}
\frac{\bar p (\bar p + \gamma) \cdots (\bar p + (X - 1)\gamma) \; \bar q (\bar q + \gamma) \cdots (\bar q + (N - X - 1)\gamma)}
{1 (1 + \gamma) \cdots (1 + (N - 1)\gamma)} \tag{69}$$

with mean $N \bar p$, variance

$$\sigma_X^2 = N \bar p \bar q \, \frac{1 + N \gamma}{1 + \gamma}$$

and

$$\delta_N = (N - 1) \frac{\gamma}{1 + \gamma}.$$

The Polya distribution is thus hypernormal for $\gamma > 0$ and subnormal for $\gamma < 0$.
Introducing (69) into (34) it is easily seen that

$$p\{x, y\} = f_n(x; \bar p, \gamma)\, f_{N-n}(y; p_x, \lambda) \tag{70}$$

where $p_x = (\bar p + x \gamma)/(1 + n \gamma)$ and $\lambda = \gamma/(1 + n \gamma)$. It follows that the
distribution of $x$ and the conditional distribution of $y$ for given $x$ both are Polya
distributions.

The sample may thus be considered as generated by the same process as the
lot, only that the production is limited to $n$ items. The remaining part of the lot
may be generated in a similar way, only that the probability of producing a
defective starts by being equal to $p_x$ and the parameter characterising the
change in the probability of producing a defective is $\lambda$. It follows that $p_n(x) = p_x$,
i.e. $p_n(x)$ is an increasing function of $x$ in the hypernormal case and a decreasing
function in the subnormal case. The Polya distribution may be considered as a
generalisation of the two previous distributions. For $\gamma = 0$ we get the binomial
case and for $\gamma = -1/M$ we get the hypergeometric case. Setting $\gamma = 1/2$ and
$\bar p = \bar q = 1/2$ we find the rectangular distribution $f_N(X) = 1/(N + 1)$.
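
The reproducibility of the Polya distribution and its special cases can be verified numerically. The following is a minimal sketch that builds the distribution from the product form (69) and samples it hypergeometrically through (35); the function names and parameter values are illustrative.

```python
from math import comb


def polya_pmf(N, p_bar, gamma):
    """Polya distribution (69) for X = 0..N."""
    q_bar = 1.0 - p_bar
    den = 1.0
    for i in range(N):
        den *= 1.0 + i * gamma
    pmf = []
    for X in range(N + 1):
        num = 1.0
        for i in range(X):
            num *= p_bar + i * gamma
        for i in range(N - X):
            num *= q_bar + i * gamma
        pmf.append(comb(N, X) * num / den)
    return pmf


def compound_hypergeom(f_N, n):
    """g_n(x) from (35)."""
    N = len(f_N) - 1
    return [comb(n, x) * sum(f_N[x + y] * comb(N - n, y) / comb(N, x + y)
                             for y in range(N - n + 1))
            for x in range(n + 1)]


lot = polya_pmf(12, 0.2, 0.1)
sample = compound_hypergeom(lot, 5)        # distribution of x in samples of 5
direct = polya_pmf(5, 0.2, 0.1)            # Polya distribution with N replaced by n
print(max(abs(a - b) for a, b in zip(sample, direct)))
print(polya_pmf(5, 0.5, 0.5))              # rectangular case
```

The first printed number is the largest discrepancy between the two routes to the sample distribution and should be at rounding-error level.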


8.4. The mixed binomial distribution. 
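Consider first a mixed binomial distribution with $m$ components

$$f_N(X; p_i, w_i) = \sum_{i=1}^{m} w_i \binom{N}{X} p_i^{X} q_i^{N-X}, \qquad \sum_{i=1}^{m} w_i = 1, \tag{71}$$

with mean $N \bar p = N \sum w_i p_i$, variance

$$\sigma_X^2 = N \sum w_i p_i q_i + N^2 \sum w_i (p_i - \bar p)^2, \tag{72}$$

and

$$\delta_N = (N - 1) \frac{\sum w_i (p_i - \bar p)^2}{\bar p \bar q}. \tag{73}$$

This is a slight generalisation of the classical (Lexis) model of the hypernormal
case which corresponds to equal weights, i.e. $w_i = 1/m$.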

The mixed binomial distribution corresponds to the situation where m suppliers 
are producing the lots submitted for inspection, each supplier having his process 
in control with a given process average and supplier no. i delivering the fraction 
w; of the total number of lots. One may also imagine one supplier delivering all 
the lots, his process being in control during production of any lot but with a 
shifting process average from lot to lot. Substituting (71) in (34) and rearranging 
factors we find 


$$p\{x, y\} = f_n(x; p_i, w_i)\, f_{N-n}(y; p_i, w_i(x)) \tag{74}$$

with

$$w_i(x) = w_i p_i^{x} q_i^{n-x} \Big/ \sum_{j=1}^{m} w_j p_j^{x} q_j^{n-x} \tag{75}$$

which means that $g_n(x)$ is a mixed binomial distribution with the same parameters
as $f_N(X)$ and that also $p\{y \mid x\}$ is a mixed binomial distribution with the same
$p$'s as in $f_N(X)$ but with weights depending on $x$. It follows further that

$$p_n(x) = \sum_{i=1}^{m} w_i(x)\, p_i . \tag{76}$$
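
Formulas (75) and (76) amount to reweighting the components by their binomial likelihoods. The following is a minimal sketch for a mixed binomial prior given by lists of process averages and weights; the function names and the numbers are illustrative only.

```python
def posterior_weights(ps, ws, n, x):
    """w_i(x) of (75): weights of the components after x defectives in a sample of n."""
    terms = [w * p**x * (1 - p)**(n - x) for p, w in zip(ps, ws)]
    total = sum(terms)
    return [t / total for t in terms]


def p_n(ps, ws, n, x):
    """p_n(x) of (76): expected fraction defective in the rest of the lot."""
    return sum(w * p for w, p in zip(posterior_weights(ps, ws, n, x), ps))


ps = [0.01, 0.05, 0.20]     # process averages of three hypothetical suppliers
ws = [0.7, 0.2, 0.1]        # their shares of the submitted lots
print([round(p_n(ps, ws, 20, x), 4) for x in range(6)])
```

Note that the lot size does not enter: for a mixed binomial prior the estimate for the non-inspected part depends only on $n$ and $x$.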


Summarising the above results we may formulate the following theorem: 


Let X denote the number of elements having a certain attribute in a population of 
N elements and let x and y = X — x denote the corresponding numbers of elements 
in a random sample (drawn without replacement) of size n and in the remainder 
of the population, respectively. If the distribution of X is a hypergeometric, a 
binomial, a rectangular, a Polya, or a mixed binomial distribution, or any weighted 
average of these distributions with weights independent of N and X, then for any 
N the distribution of x is the same as the distribution of X with n substituted for 
N, and the distribution of y for given x is also of the same type but with parameters 
depending on x and n. 


In the following we shall derive some properties of the mixed binomial distri- 
bution for use in later sections. 

To prove that $p_n(x)$ is an increasing function of $x$ we notice that the inequality
$p_n(x + 1) - p_n(x) > 0$ leads to the condition

$$\sum_i \sum_j a_{ij} (p_i^2 q_j - p_i p_j q_i) > 0$$

with

$$a_{ij} = a_{ji} = w_i w_j (p_i p_j)^{x} (q_i q_j)^{n-x-1} > 0.$$

Since

$$\sum_i \sum_j a_{ij}\, p_i p_j (p_j - p_i) = 0$$

and

$$\sum_i \sum_j a_{ij}\, p_i^2 \ge \sum_i \sum_j a_{ij}\, p_i p_j$$

as a consequence of Schwarz's inequality the condition will always be satisfied,
i.e. $p_n(x)$ is increasing. A completely analogous reasoning shows that $p_n(x)$ is a
decreasing function of $n$.


Generalising the weight-function to include continuous functions, i.e. considering
the process average as a continuous random variable, we find

$$f_N(X) = \binom{N}{X} \int_0^1 p^{X} q^{N-X} w(p) \, dp. \tag{77}$$

Choosing $w(p)$ as a Beta-distribution

$$w(p) = \frac{\Gamma(s + t)}{\Gamma(s)\, \Gamma(t)} p^{s-1} q^{t-1}, \qquad s > 0, \; t > 0,$$

we find

$$f_N(X) = \binom{N}{X} \frac{\Gamma(s + X)\, \Gamma(t + N - X)}{\Gamma(s)\, \Gamma(t)} \frac{\Gamma(s + t)}{\Gamma(s + t + N)}$$

which is identical to the Polya distribution (69) for $\bar p = s/(s + t)$ and
$\gamma = 1/(s + t)$, see also Lundberg (21). Writing the mixed binomial distribution in
general as

$$f_N(X) = \binom{N}{X} \int_0^1 p^{X} q^{N-X} \, dW(p)$$

we have

$$P\{X \le N p_0\} = \sum_{X=0}^{[N p_0]} f_N(X)
= \int_0^1 \Big[ \sum_{X=0}^{[N p_0]} \binom{N}{X} p^{X} q^{N-X} \Big] dW(p)
\simeq \int_0^1 \Phi\big( (p_0 - p) \sqrt{N} / \sqrt{pq} \big) \, dW(p),$$

$\Phi$ denoting the standardised normal distribution. For $N \to \infty$ the integrand
tends to 1 for $p < p_0$ and to 0 for $p > p_0$ so that the cumulative mixed binomial
distribution tends to the cumulative weight function.

To investigate the limit of $p_n(x)$ from (76) for $n \to \infty$ and $x/n \to h$ we write

$$w_i(x) = \Big[ 1 + \sum_{j \ne i} \frac{w_j}{w_i} \Big( \frac{p_j}{p_i} \Big)^{x} \Big( \frac{q_j}{q_i} \Big)^{n-x} \Big]^{-1}$$

and study first the limit of a single term of the denominator. We have immediately
that

$$\begin{aligned}
n^{-1} \log [(p_j/p_i)^{x} (q_j/q_i)^{n-x}] &= (x/n) \log (p_j/p_i) + (1 - x/n) \log (q_j/q_i) \\
&\to h \log (p_j/p_i) + (1 - h) \log (q_j/q_i) = \varphi_{ij}(h).
\end{aligned} \tag{79}$$

For $\varphi_{ij}(h) = 0$ we find

$$h = h_{ij} = \log (q_i/q_j) / \log (p_j q_i / p_i q_j). \tag{80}$$

Since the coefficient of $h$ in $\varphi_{ij}(h)$ is positive for $p_j > p_i$ and negative for $p_j < p_i$
it follows that the sign of $\varphi_{ij}(h)$ varies as shown in the following table.

Sign of $\varphi_{ij}(h)$.

                  p_j < p_i     p_j > p_i
    h < h_ij          +             -
    h > h_ij          -             +

Consequently

$$(p_j/p_i)^{x} (q_j/q_i)^{n-x} \sim \exp [n \varphi_{ij}(h)] \to
\begin{cases} \infty & \text{for } \varphi_{ij}(h) > 0 \\ 0 & \text{for } \varphi_{ij}(h) < 0. \end{cases}$$

We shall now prove that $h_{ij}$ lies between $p_i$ and $p_j$. For $p_i < p_j$, say, we have

$$h_{ij} - p_i = -[\log (p_i q_j / p_j q_i)]^{-1} \, [-\log (q_j/q_i) + p_i \log (p_i q_j / p_j q_i)].$$

Since the first factor is always positive the sign of $h_{ij} - p_i$ must be the same as
the sign of the second factor which may be written as

$$f(p_i) = p_i \log (p_i/p_j) + q_i \log (q_i/q_j), \qquad 0 < p_i < p_j . \tag{81}$$

Since $f(p_i)$ is a decreasing function and $f(p_j) = 0$ it follows that $h_{ij} - p_i > 0$.
Similarly it may be shown that $h_{ij} - p_j < 0$. For $n \to \infty$ and $x/n \to h$ we
therefore have

$$w_i(x) \sim \Big[ 1 + \sum_{j \ne i} \frac{w_j}{w_i} \exp \{ n \varphi_{ij}(h) \} \Big]^{-1} \to
\begin{cases} 1 & \text{for all } \varphi_{ij}(h) < 0 \\ 0 & \text{otherwise.} \end{cases}$$

Assuming $p_1 < p_2 < \cdots < p_m$ we find that

$$w_i(x) \to \begin{cases} 1 & \text{for } h_{i-1,i} < h < h_{i,i+1} \\ 0 & \text{for all other values of } h, \end{cases} \tag{82}$$

where $h_{0,1} = 0$ and $h_{m,m+1} = 1$.

The final result is that $p_n(x)$ tends to a step-function taking on the values $p_1$,
$p_2, \ldots, p_m$, the jumps taking place in the points $0, h_{12}, h_{23}, \ldots, h_{m-1,m}$. This
result is also reasonable from the point of view of estimating the relative
frequency of defectives in the remainder of a lot using $p_n(x)$ as estimator. By means
of an observed relative frequency, $x/n \simeq h$, say, in the (large) sample we choose
among the $m$ possible values of $p$ by means of $p_n(x)$, i.e. for $h_{i-1,i} < x/n < h_{i,i+1}$
we get $p_i$ as the expected relative frequency in the remainder of the lot. For
$m = 2$ we have two jumps, namely for $h = 0$ and $h = h_{12}$. In the special case
where the weight function is a Beta-distribution $p_n(x)$ becomes a linear function
of $x$.
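
The step-function limit can be illustrated for a two-component prior by evaluating the break point (80) and then $p_n(x)$ for a large sample on either side of it. The sketch below works with logarithms to avoid numerical underflow; the function names and the parameter values are assumptions made for illustration.

```python
from math import exp, log


def break_point(p_i, p_j):
    """h_ij of (80), the sample fraction at which components i and j are equally likely."""
    q_i, q_j = 1 - p_i, 1 - p_j
    return log(q_i / q_j) / log(p_j * q_i / (p_i * q_j))


def p_n_large(ps, ws, n, x):
    """p_n(x) of (76), computed on the log scale so that large n does not underflow."""
    logs = [log(w) + x * log(p) + (n - x) * log(1 - p) for p, w in zip(ps, ws)]
    top = max(logs)
    terms = [exp(v - top) for v in logs]
    return sum(t * p for t, p in zip(terms, ps)) / sum(terms)


ps, ws = [0.02, 0.10], [0.5, 0.5]
h12 = break_point(0.02, 0.10)
print(h12)
print(p_n_large(ps, ws, 5000, 200), p_n_large(ps, ws, 5000, 300))
```

With $n = 5000$ a sample fraction of 0.04 lies below the break point and a fraction of 0.06 above it, so the two estimates are practically equal to the two possible process averages.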


8.5. The general case. 


After having studied some special cases of reproducible distributions the
question arises: What is the general form of a reproducible distribution having a
given non-degenerate limit? This question was answered by Brøns (22) who is
going to publish a proof elsewhere. In the following we shall derive the answer
to this question by making use of the previously derived moment relations.
Denoting the $r$th factorial moment of $f_N(X)$ by $m_{(r)}(N)$ it follows from the
reproducibility that the $r$th factorial moment of $g_n(x)$ is $m_{(r)}(n)$. Inserting this
into (37) we find

$$m_{(r)}(n) = \frac{n^{(r)}}{N^{(r)}}\, m_{(r)}(N) \quad \text{for all } N \ge n,$$

which leads to the fundamental relation

$$m_{(r)}(N) = N^{(r)} a_r \quad \text{for } N \ge r, \tag{83}$$

where

$$a_r = m_{(r)}(r)/r!\,.$$


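To express $f_N(X)$ by the factorial moments we differentiate the generating
function for the factorial moments

$$\sum_{X=0}^{N} (1 + t)^X f_N(X) = \sum_{\nu=0}^{N} \frac{t^\nu}{\nu!}\, m_{(\nu)}(N)$$

$X$ times with regard to $t$ and put $t = -1$ which gives

$$f_N(X) = \frac{1}{X!} \sum_{\nu=0}^{N-X} \frac{(-1)^\nu}{\nu!}\, m_{(X+\nu)}(N).$$

Inserting (83) we get

$$f_N(X) = \binom{N}{X} \sum_{\nu=0}^{N-X} (-1)^\nu \binom{N-X}{\nu} a_{X+\nu}. \tag{84}$$

Introducing the limiting distribution

$$\Phi(p) = \lim_{N\to\infty} \Phi_N(p) = \lim_{N\to\infty} \sum_{X=0}^{[Np]} f_N(X)$$

we find

$$\sum_{X=0}^{N} \frac{X^{(r)}}{N^{(r)}} f_N(X) = \int_0^1 \frac{(Np)^{(r)}}{N^{(r)}}\, d\Phi_N(p)
= \int_0^1 \frac{p\,(p - \frac{1}{N}) \cdots (p - \frac{r-1}{N})}{(1 - \frac{1}{N}) \cdots (1 - \frac{r-1}{N})}\, d\Phi_N(p)
\to \int_0^1 p^r\, d\Phi(p)$$

and consequently

$$\int_0^1 p^r\, d\Phi(p) = a_r. \tag{85}$$

Combining (84) and (85) we find

$$f_N(X) = \binom{N}{X} \sum_{\nu=0}^{N-X} (-1)^\nu \binom{N-X}{\nu} \int_0^1 p^{X+\nu}\, d\Phi(p)
= \binom{N}{X} \int_0^1 p^X (1 - p)^{N-X}\, d\Phi(p). \tag{86}$$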


We have thus proved the following theorem: 


The class of reproducible distributions with a given non-degenerate limiting distri- 
bution consists of the sequence of mixed binomial distributions with the given limiting 
distribution as weight function. 
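The reproducibility property stated in the theorem can be verified exactly for a small case. The sketch below is illustrative only: it assumes a discrete weight function (the values `ps`, `ws` are hypothetical), and it assumes that the sample distribution is obtained by hypergeometric sampling from the lot, which is how we read the relation between $f_N(X)$ and $g_n(x)$.

```python
from fractions import Fraction
from math import comb

def mixed_binomial(N, ps, ws):
    """f_N(X) for a mixed binomial with weights ws on the values ps."""
    return [sum(w * comb(N, X) * p**X * (1 - p)**(N - X) for p, w in zip(ps, ws))
            for X in range(N + 1)]

def sample_distribution(n, f_N):
    """g_n(x): distribution of defectives in a sample of n drawn without
    replacement from a lot whose number of defectives follows f_N."""
    N = len(f_N) - 1
    g = []
    for x in range(n + 1):
        total = Fraction(0)
        for X in range(x, N - (n - x) + 1):
            hyper = Fraction(comb(X, x) * comb(N - X, n - x), comb(N, n))
            total += hyper * f_N[X]
        g.append(total)
    return g

# Hypothetical weight function, exact rational arithmetic.
ps = [Fraction(1, 20), Fraction(1, 4)]
ws = [Fraction(3, 4), Fraction(1, 4)]
g_4 = sample_distribution(4, mixed_binomial(12, ps, ws))
f_4 = mixed_binomial(4, ps, ws)
print(g_4 == f_4)
```

Exact fractions are used so that the comparison of $g_n(x)$ with $f_n(x)$ is not blurred by rounding.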


Also in the general case we therefore have the property observed in the special
cases, namely that the distribution of $y$ for given $x$ is a mixed binomial distribution
with a weight function $\Phi_x(p)$ depending on $x$ given by

$$d\Phi_x(p) = \frac{p^x q^{n-x}\, d\Phi(p)}{\int_0^1 p^x q^{n-x}\, d\Phi(p)}. \tag{87}$$


The result (86) may also be derived by introducing

$$g_n(x) = f_n(x) = \binom{n}{x} h_n(x)$$

into (35) which gives

$$h_n(x) = \sum_{y=0}^{m} \binom{m}{y} h_{n+m}(x + y) \quad \text{for } m \ge 0$$

or

$$\{(1 + E_x)^m E_n^m - 1\}\, h_n(x) = 0, \quad m \ge 0,$$

as the conditions for reproducibility. Solving the partial difference equation by
means of Boole's method we find

$$h_n(x) = (-1)^x \Delta_n^x h_{n-x}(0)$$

or

$$f_n(x) = (-1)^x \binom{n}{x} \Delta_n^x f_{n-x}(0)$$

which has the mixed binomial distribution as a particular solution since
$\Delta_n q^{n-x} = -p q^{n-x}$. The requirement to $f_n(x)$ for $n \to \infty$ then determines the weight
function and at the same time shows that the particular solution also is the
complete solution since all factorial moments of the arbitrary function have to
be zero.

To develop asymptotic formulae for the mixed binomial distribution we consider
separately the cases where the weight function is continuous and discrete.
Let $w(p)$ be a continuous density function with $m$ parameters and

$$f_n(x) = \int_0^1 \binom{n}{x} p^x q^{n-x} w(p)\, dp.$$

It seems reasonable to approximate $f_n(x)$ by a continuous function $w_n(p)$ so that

$$f_n(x) \simeq \int_{(x-\frac12)/n}^{(x+\frac12)/n} w_n(p)\, dp \simeq \frac{1}{n}\, w_n\!\left(\frac{x}{n}\right) \tag{88}$$

where $w_n(p)$ is the same type of distribution as $w(p)$, only that the parameters
are functions of $n$, and the moments of $f_n(x)$ and $w_n(p)$ are identical apart from
the factor $n^r$, i.e.

$$\sum_{x=0}^{n} x^r f_n(x) = n^r \int_0^1 p^r w_n(p)\, dp.$$

Since

$$\sum_{x=0}^{n} x^r f_n(x) = \sum_{\nu=0}^{r} \frac{\Delta^\nu 0^r}{\nu!}\, m_{(\nu)}(n) = \sum_{\nu=0}^{r} \Delta^\nu 0^r \binom{n}{\nu} a_\nu$$

we have the following equations for the determination of the moments and
consequently also the $m$ parameters of $w_n(p)$:

$$\int_0^1 p^r w_n(p)\, dp = \frac{1}{n^r} \sum_{\nu=0}^{r} \Delta^\nu 0^r \binom{n}{\nu} a_\nu, \quad r = 1, 2, \ldots, m. \tag{89}$$

The cumulative distribution functions $\Phi_n(p)$ and $W_n(p)$ have thus the same moments
for all $n$ and for $n \to \infty$ both distributions tend to $\Phi(p)$.

For $n \to \infty$ and $x/n \to h$ we therefore have

$$f_n(x) \sim \frac{1}{n}\, w(h). \tag{90}$$


In the discontinuous case we have

$$f_n(x) = \sum_{i=1}^{m} w_i \binom{n}{x} p_i^x q_i^{n-x}.$$

For $n \to \infty$ and $x/n \to h$ we find by means of Stirling's formula that

$$\binom{n}{x} p^x q^{n-x} \sim \Big[2\pi n\, \frac{x}{n}\Big(1 - \frac{x}{n}\Big)\Big]^{-1/2} \Big(\frac{np}{x}\Big)^{x} \Big(\frac{nq}{n-x}\Big)^{n-x}$$
$$\sim [2\pi n h(1 - h)]^{-1/2} \exp[-n\{h \log (h/p) + (1 - h) \log ((1 - h)/q)\}]. \tag{91}$$

Assuming that $p_1 < p_2 < \cdots < p_m$ and that $p_i < h < p_{i+1}$ we have

$$[2\pi n h(1 - h)]^{1/2} f_n(x) \sim w_i \Big\{1 + \sum_{j=1}^{i-1} (w_j/w_i) \exp\{n\varphi_{ij}(h)\}\Big\}
\exp[-n\{h \log (h/p_i) + (1 - h) \log ((1 - h)/q_i)\}]$$
$$+\; w_{i+1} \Big\{1 + \sum_{j=i+2}^{m} (w_j/w_{i+1}) \exp\{n\varphi_{i+1,j}(h)\}\Big\}
\exp[-n\{h \log (h/p_{i+1}) + (1 - h) \log ((1 - h)/q_{i+1})\}]$$

where $\varphi_{ij}(h)$ is given by (79). From the properties of $\varphi_{ij}(h)$ previously discussed
it follows that the dominating term for $p_i < h < h_{i,i+1}$ becomes

$$f_n(x) \sim w_i [2\pi n h(1 - h)]^{-1/2} \exp[-n\{h \log (h/p_i) + (1 - h) \log ((1 - h)/q_i)\}] \tag{92}$$

so that $n^{1/2} f_n(x)$ tends exponentially to zero. For $h = h_{i,i+1}$ the two exponents
are equal so that

$$f_n(x) \sim (w_i + w_{i+1}) [2\pi n h(1 - h)]^{-1/2} \exp[-n\{h \log (h/p_i) + (1 - h) \log ((1 - h)/q_i)\}]. \tag{93}$$


Finally we consider the least-squares linear regression (49) for the class of
reproducible distributions. Expressing the coefficient of dispersion by means of $a_1$
and $a_2$ we find

$$b_N = (N - 1)\sigma_p^2/\bar{p}\bar{q} \tag{94}$$

where $\sigma_p^2 = a_2 - a_1^2$ and $\bar{p} = a_1$. From (50) we then get

$$\beta_n = [1 - (\sigma_p^2 - \bar{p}\bar{q})/n\sigma_p^2]^{-1} \to 1 \quad \text{for } n \to \infty.$$

Introducing this expression for $\beta_n$ into (49) we obtain

$$p_n(x) = (x - \bar{p}\lambda)/(n - \lambda) \tag{95}$$

where $\lambda = (\sigma_p^2 - \bar{p}\bar{q})/\sigma_p^2$. Thus, $p_n(x) \approx x/n$ for large $n$ as postulated in section 6.
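As a check on (95), the regression can be evaluated for a Beta weight function, for which the posterior mean is known in closed form. This is a small sketch under our own assumptions: the moments $a_1$, $a_2$ are taken to be the first two moments of the weight function, as in (85), and the helper names are ours.

```python
from fractions import Fraction

def linear_regression(x, n, a1, a2):
    """p_n(x) = (x - p_bar*lam)/(n - lam), with lam = (var - p_bar*q_bar)/var."""
    p_bar = a1
    var = a2 - a1 ** 2
    lam = (var - p_bar * (1 - p_bar)) / var
    return (x - p_bar * lam) / (n - lam)

def beta_moments(s, t):
    """First two moments of a Beta(s, t) weight function."""
    return Fraction(s, s + t), Fraction(s * (s + 1), (s + t) * (s + t + 1))

# Rectangular weight function, Beta(1, 1): lam = -2, so p_n(x) = (x + 1)/(n + 2).
rect = linear_regression(3, 10, *beta_moments(1, 1))
# Beta(2, 5): lam = -(s + t) = -7, so p_n(x) = (x + 2)/(n + 7).
skew = linear_regression(1, 20, *beta_moments(2, 5))
print(rect, skew)
```

For a Beta(s, t) weight function the sketch gives $\lambda = -(s + t)$, so that (95) reduces to $(x + s)/(n + s + t)$.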
8.6. General remarks on prior distributions.


It is rather difficult to specify the properties of the prior distribution in 
general terms. It seems realistic, however, to require that the prior distribution 
should depend on N in such a way that the variance increases with N at least 
proportional to N’. For the mixed binomial distribution we have 


$$V\{X\} = N(N - 1)\sigma_p^2 + N\bar{p}\bar{q}$$

so that the variance of $X/N$ is a decreasing function of $N$ tending to

$$\sigma_p^2 = \int_0^1 (p - \bar{p})^2\, d\Phi(p).$$
For the application of the mixed binomial distribution it is therefore essential 
to discuss whether it is realistic to assume that the formation of lots is such that 


the variance of the fraction defective decreases with lot size as specified above. 
The Dodge-Romig prior distribution may be written as

$$f_N(X) = w_1 \binom{N}{X} p_1^X q_1^{N-X} + w_2 \varphi(X)$$

where $w_1$ and $\varphi(X)$ are unspecified ($w_2 = 1 - w_1$). Denoting the means of the
two parts by $Np_1$ and $Np_2$, respectively, we get

$$V\{X\} = w_1 N p_1 q_1 + w_2 \sigma_2^2 + N^2 \sum_{i=1}^{2} w_i (p_i - \bar{p})^2$$

where $\sigma_2^2$ denotes the variance of the second part and $\bar{p} = w_1 p_1 + w_2 p_2$. In
trying to specify this distribution completely it might be tempting to choose
$\varphi(X)$ as a one-point distribution which leads to $\sigma_2^2 = 0$ and thus simplifies the
variance. The corresponding $g_n(x)$ is, however, not convenient. It is more satisfactory,
as Barnard has done, to specify $\varphi(X)$ as another binomial component.
This does not introduce more parameters than the one-point distribution, it is
more realistic, and $g_n(x)$ becomes a mixed binomial with two components which
is easy to handle numerically. The mixed binomial with two components is,
however, not completely satisfactory but may well serve as a starting point for
many investigations. Its advantages are that it only contains three easily
understandable parameters and that it covers both skew and two-peaked distributions.
As $N \to \infty$, however, the two peaks get more and more pronounced and this is
not desirable.

Sittig, Champernowne, and James have used the Beta-distribution as a prior 
distribution in combination with a binomial distribution. It follows from the 
theory developed above that it is more reasonable to use a Polya-distribution 
as a prior distribution and combine that with a hypergeometric distribution in 
that way obtaining the same “sample distribution” as the authors mentioned 
above without using any approximations. Most of the prior distributions pro- 
posed by previous authors are members of the class of mixed binomial distri- 
butions. 

For the further development of the theory of sampling inspection it is an
important problem on the basis of experience from process control and sampling
inspection to specify properties of the sequence of prior distributions $f_N(X)$.
Unfortunately, published data on prior distributions are very scarce. It is important
to notice that considerations regarding $f_N(X)$ also should be based on models
of the production process itself so that the theories of process control and
lot-by-lot sampling inspection in this way may be linked closer together than at
present.

The mixed binomial distribution may be generated by a process where each
lot of size $N$ is produced in a state of statistical (binomial) control but where the
process average varies from lot to lot in accordance with the distribution function
$\Phi(p)$. Until better models have been developed the mixed binomial distribution
may well serve as an approximation, and probably a rather good one for
the purpose of developing sampling plans, since the compound hypergeometric
distribution converges to a mixed binomial distribution if the sequence of prior
distributions converges.


9. THE OPTIMUM SAMPLING PLAN

The cost function given by (23) may be written as

$$K(n, c) = n(k_s - k_r) + N k_r + (N - n) \sum_{x=0}^{c} (p_n(x) - k_r)\, g_n(x). \tag{96}$$

This cost should be compared with the cost of acceptance without inspection,
$N\bar{p}$, and with the cost of rejection, $N k_r$. The condition for sampling inspection
to be the cheaper solution thus becomes

$$(N - n) \sum_{x=0}^{c} (k_r - p_n(x))\, g_n(x) > n(k_s - k_r) + N(k_r - \bar{p})\,\varepsilon \tag{97}$$

where $\varepsilon = 1$ for $\bar{p} < k_r$ and $\varepsilon = 0$ for $\bar{p} > k_r$. The left hand side of (97) depends
on the loss of rejection, $(N - n)k_r$, minus the average loss of acceptance,
$(N - n)p_n(x)$, for $x$ defectives in the sample. Obviously, the average of these
differences for accepted lots has to be positive and as large as possible, see fig. 2.


FIG. 2. Decision losses as functions of the number of defectives in the sample.
(Curves: acceptance loss and rejection loss; abscissa: number of defectives in sample, $x$, with the acceptance number $c$ marked.)

The plan shown in fig. 2 is not an optimum plan since it includes some values
of $x$ for which $k_r - p_n(x)$ is negative. It is intuitively clear from the graph that
a maximum of the left hand side of (97) for given $n$ is obtained for $p_n(c) \le
k_r < p_n(c + 1)$, i.e. for the value of $c$ where the curve representing $(N - n)p_n(x)$
and the horizontal line representing $(N - n)k_r$ intersect. In fig. 2 the graph of
$p_n(x)$ has been drawn as an increasing function of $x$. From section 8 we know,
however, that $p_n(x)$ may for some prior distributions be decreasing. In such
cases it is clear that the usual decision rule has to be reversed to get an admissible
plan, i.e. lots should be accepted when the sample contains a large number of
defectives and rejected for a small number of defectives. This has also been
observed by Mood (9) on the basis of his theorem on the correlation coefficient
for $(x, y)$. As such prior distributions probably are without any practical
importance we shall discuss only the usual decision rule in the following.

We now consider the cost-function (96) as a function of $n$ and $c$ for given values
of $k_s$, $k_r$, and $N$, and for a given prior distribution, and determine the optimum
sampling plan by minimising $K(n, c)$. In this discussion it is useful to introduce
the weighted average of $p_n(x)$ for $x \le c$, i.e.

$$\bar{p}_n(c) = \sum_{x=0}^{c} p_n(x)\, g_n(x) \Big/ \sum_{x=0}^{c} g_n(x) \tag{98}$$

which may be interpreted as the average fraction defective in the non-inspected
part of accepted lots.


By means of (54) and (57) the cost-function (96) may be written as

$$K(n, c) = n k_s + (N - n)[k_r - \gamma(n, c)] = n[k_s - k_r + \gamma(n, c)] + N[k_r - \gamma(n, c)] \tag{99}$$

where

$$\gamma(n, c) = k_r G_n^{(1)}(c) - \bar{p}\, G_n^{(2)}(c). \tag{100}$$

For convenience $G_n^{(1)}(x)$ will also be denoted by $G_n(x)$ as in section 7.
The differences of $\gamma(n, c)$ with respect to $c$ and $n$ are

$$\Delta_c \gamma(n, c) = k_r g_n^{(1)}(c + 1) - \bar{p}\, g_n^{(2)}(c + 1) = g_n(c + 1)(k_r - p_n(c + 1)), \tag{101}$$

$$\Delta_n \gamma(n, c) = k_r \Delta_n G_n^{(1)}(c) - \bar{p}\, \Delta_n G_n^{(2)}(c) = g_n(c)\, p_n(c)\, [p_{n+1}(c + 1) - k_r] \tag{102}$$

according to (62) and (63).
For given $n$ the difference of $K$ with respect to $c$ is

$$\Delta_c K(n, c) = (N - n)\, g_n(c + 1)(p_n(c + 1) - k_r). \tag{103}$$

This shows that $K(n, c)$ has one and only one minimum with respect to $c$ for any
given $n$ provided $p_n(x)$ is an increasing function of $x$ which will be assumed in the
following. The value of $c$ minimising $K$ is obtained from the inequality

$$\Delta_c K(n, c - 1) \le 0 < \Delta_c K(n, c) \tag{104}$$

which gives the fundamental inequality

$$p_n(c) \le k_r < p_n(c + 1), \qquad 0 \le c \le n - 1. \tag{105}$$

This inequality simply states that for any $n$ the minimum cost is obtained for the
value of $c$ for which the expected loss due to accepted defective items is "nearly
equal" to the rejection cost, see also fig. 2.

If $p_n(0) > k_r$ the minimum value of $K$ is to be found for $c = 0$ and if $p_n(n) < k_r$
the minimum value is found for $c = n$. Solving the equation $p_n(c) = k_r$ leads to
$c = F_1(n)$ so that (105) may be interpreted as an inequality of the following form

$$F_1(n) - 1 < c \le F_1(n), \tag{106}$$

i.e. $c$ is the largest integer less than or equal to $F_1(n)$, see fig. 3.

Since both $n$ and $c$ are integers the inequalities (105) or (106) must be
interpreted as defining a path in the $(n, c)$-plane as depicted by the step-function
in fig. 3. This means that for $n_0 \le n < n_1$, $n$ being an integer, $c$ equals 0, for
$n_1 \le n < n_2$ $c$ equals 1, etc.


From (54) and (100) we find

$$\gamma(n, c) = k_r G_n(c) - \sum_{x=0}^{c} p_n(x)\, g_n(x) = (k_r - \bar{p}_n(c))\, G_n(c). \tag{107}$$

It then follows from (105) that $\gamma(n, c)$ is positive for an optimum sampling plan.
Using (48) we may also write the first fundamental inequality (105) in the form

$$\Big(\frac{c + 1}{n + 1}\Big) \frac{g_{n+1}(c + 1)}{g_n(c)} \le k_r < \Big(\frac{c + 2}{n + 1}\Big) \frac{g_{n+1}(c + 2)}{g_n(c + 1)}. \tag{108}$$

This indicates that for large $n$ we must expect that $(c + 1)/(n + 1) \approx k_r$, at
least if $g_n(x)$ has a continuous limit.

FIG. 3. Relation between n and c.

Considering $c$ as a function of $n$ defined by (106) we may regard $K(n, c)$ as a
function of $n$ alone and determine the values of $n$ for which $\Delta_n K(n, c)$ shifts from
a negative to a positive value. This procedure must, however, lead to the same
result as studying $\Delta_n K(n, c)$ for every $c$ over the whole range of $n$ and later
imposing the restriction (106) on the solution. For given $c$ the difference of $K$ with
respect to $n$ is

$$\Delta_n K(n, c) = k_s - k_r + \gamma(n, c) - (N - n - 1)\, \Delta_n \gamma(n, c)$$
$$= k_s - k_r + \gamma(n, c) + (N - n - 1)\, g_n(c)\, p_n(c)\, [k_r - p_{n+1}(c + 1)]. \tag{109}$$
The following discussion rests on the assumption that $K(n, c)$ has one and only
one minimum with respect to $n$ for any given $c$. The condition for this assumption
being fulfilled has not been expressed in simple form but it is conjectured
that the condition depends on $p_n(x)$ being a decreasing function of $n$ for any $x$.
The value of $n$ minimising $K$ is obtained from the inequality

$$\Delta_n K(n - 1, c) \le 0 < \Delta_n K(n, c), \qquad c + 1 \le n \le N - 1,$$

which leads to

$$(N - n - 1)\, \Delta_n \gamma(n, c) - \gamma(n, c) < k_s - k_r \le (N - n)\, \Delta_n \gamma(n - 1, c) - \gamma(n - 1, c) \tag{110}$$

or

$$(N - n - 1)\, \Delta_n \gamma(n, c) < k_s - k_r + \gamma(n, c) \le (N - n + 1)\, \Delta_n \gamma(n - 1, c). \tag{111}$$

Solving (110) with respect to $n$ leads to

$$F_2(c) - 1 < n \le F_2(c). \tag{112}$$


Combining (106) and (112) we get the solution sketched in fig. 4. The optimum
value of $(n, c)$ is situated within the hatched area. If the curves intersect in such a
manner that several values of $(n, c)$ are within this area, i.e. several pairs of
integers satisfy the two inequalities, it becomes necessary to determine the
absolute minimum by insertion into $K(n, c)$. Defining

$$F(n, c) = n + 1 + [k_s - k_r + \gamma(n, c)]/\Delta_n \gamma(n, c) \tag{113}$$

(110) may be written as

$$F(n - 1, c) \le N < F(n, c) \tag{114}$$

if $\Delta_n \gamma(n, c) > 0$.
Denoting the minimum value of $K(n, c)$ by $K_0$ we may eliminate $k_s - k_r +
\gamma(n, c)$ from (111) by means of (99) which gives

$$n\Big(1 - \frac{n + 1}{N}\Big) \Delta_n \gamma(n, c) < \frac{K_0}{N} - (k_r - \gamma(n, c)) \le n\Big(1 - \frac{n - 1}{N}\Big) \Delta_n \gamma(n - 1, c) \tag{115}$$

where

$$k_r - \gamma(n, c) = \bar{p}_n(c)\, G_n(c) + k_r(1 - G_n(c)). \tag{116}$$

FIG. 4. Determination of the optimum value of (n, c).
(Curves shown: $n = F_2(c) - 1$, $n = F_2(c)$, $c = F_1(n)$, and $c = F_1(n) - 1$.)

For $\Delta_n \gamma(n, c) > 0$ we find

$$\frac{K_0}{N} > \bar{p}_n(c)\, G_n(c) + k_r(1 - G_n(c)). \tag{117}$$

It is therefore necessary for a sampling inspection plan to be cheaper than
acceptance without inspection that the above lower limit to $K_0/N$ be less than $\bar{p}$.
The above results may be summarised as follows:


The optimum sampling plan corresponding to a given prior distribution, given
lot size, and given cost parameters is to be found by solving the two inequalities (105)
and (110) with respect to (n, c). If more than one solution exists the optimum value
is found by choosing the one which makes (99) a minimum. A lower limit to the
minimum costs is given by (117).
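The summary can be illustrated by a brute-force search. The sketch below is ours and is not the tabulation method of the paper: it assumes a two-point mixed binomial prior with hypothetical values `ps`, `ws`, evaluates $K(n, c)$ from (99) with $\gamma(n, c) = \sum_{x \le c} (k_r - p_n(x))\, g_n(x)$ for every plan, and then inspects the posterior means on both sides of the chosen acceptance number. The convention `c = -1` (reject every lot) is introduced here for convenience.

```python
from math import comb

def plans(N, ks, kr, ps, ws):
    """Yield (K, n, c) for every single sampling plan; c = -1 rejects every lot."""
    for n in range(N + 1):
        gamma = 0.0  # gamma(n, c), accumulated over x <= c
        yield n * ks + (N - n) * kr, n, -1
        for c in range(n + 1):
            b = [w * comb(n, c) * p**c * (1 - p)**(n - c) for p, w in zip(ps, ws)]
            g = sum(b)                               # g_n(c)
            pg = sum(p * v for p, v in zip(ps, b))   # p_n(c) * g_n(c)
            gamma += kr * g - pg
            yield n * ks + (N - n) * (kr - gamma), n, c

def posterior_mean(x, n, ps, ws):
    """p_n(x) for the mixed binomial prior."""
    b = [w * p**x * (1 - p)**(n - x) for p, w in zip(ps, ws)]
    return sum(p * v for p, v in zip(ps, b)) / sum(b)

# Hypothetical prior and costs (illustrative values only).
N, ks, kr = 100, 0.08, 0.08
ps, ws = [0.02, 0.20], [0.85, 0.15]
K_opt, n_opt, c_opt = min(plans(N, ks, kr, ps, ws))
p_bar = sum(p * w for p, w in zip(ps, ws))
print(K_opt, n_opt, c_opt, N * p_bar, N * kr)
```

An exhaustive search is only practical for small lots, but it gives an independent check on plans found from the two inequalities.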


The "sorting effect" of sampling inspection may be evaluated by comparing
the prior distribution $f_N(X)$ and the distribution of accepted lots

$$f(y) = \sum_{x=0}^{c} p\{y \mid x\}\, g_n(x)/G_n(c), \tag{118}$$

cf. (45), assuming that defective items found in the sample are replaced by good
ones. Particularly we find the average fraction defective in accepted lots as

$$M\{y/N\} = (1 - n/N) \sum_{x=0}^{c} p_n(x)\, g_n(x)/G_n(c) = (1 - n/N)\, \bar{p}_n(c). \tag{119}$$


Introducing the assumption that rejected lots are to be screened and all
defective items replaced by good ones as in the Dodge-Romig
average-quality-protection system it follows that the expected value of the average fraction
defective after sampling inspection and screening becomes

$$E(AOQ) = (1 - n/N)\, \bar{p}_n(c)\, G_n(c). \tag{120}$$


It is important to notice that the inequality (105) only depends on $N$ through
the function $g_n(x)$ and that $N$ enters linearly into the inequality (110) apart
from its influence through $g_n(x)$. Therefore, if $g_n(x)$ is independent of $N$ the whole
problem is much simplified.

In the following we shall discuss in greater detail the determination of optimum
sampling plans assuming that the prior distribution is a mixed binomial. In this case
$g_n(x)$ is independent of $N$ and (105) gives a relation between $n$ and $c$ which must
be valid for all lot sizes.

In developing a system of sampling plans covering all lot sizes we may therefore
begin with the tabulation of the function $c = F_1(n)$ or the inverse function
$n = F_1^{-1}(c)$ which means that for a given value of $c$ the inequality (105) will be
satisfied by the following values of $n$

$$F_1^{-1}(c) \le n < F_1^{-1}(c + 1). \tag{121}$$


Since $F_1^{-1}(c)$ is an increasing function of $c$ the inequality (121) leads to a division
of all $n \ge F_1^{-1}(0)$ into non-overlapping consecutive intervals corresponding to
successive values of $c$. Defining $n_c$ as the smallest integer satisfying (121) we
may also write

$$n_c \le n < n_{c+1}. \tag{122}$$

The function $n = F_2(c)$, however, depends on $N$ and moves to the right as $N$
increases, see fig. 4, thus generating successively the optimum plans corresponding
to increasing values of $N$. Instead of solving (110) for $n$, considering $N$ as
given, it is much easier to solve for $N$ as shown in (114) where $F(n, c)$ now is
independent of $N$.

For a given $c$ we insert successively the "permissible" values of $n$, given by
(122), into $F(n - 1, c)$. Since $F(n - 1, c)$ is an increasing function of $n$ for
$n_c \le n < n_{c+1}$ the inequality (114) leads to a division of all $N \ge F(n_c - 1, c)$
into non-overlapping consecutive intervals corresponding to successive values
of $n$. The last interval is open to the right since $\Delta_n \gamma(n, c) \le 0$ for $n = n_{c+1} - 1$
which may be seen from (102) combined with (105). For each $c$ we thus get a
table of the following form:

    c    n_c            F(n_c - 1, c)     <= N < F(n_c, c)
    c    n_c + 1        F(n_c, c)         <= N < F(n_c + 1, c)
    ...
    c    n_{c+1} - 1    F(n_{c+1} - 2, c) <= N < infinity

Values of $c$, $n$, and $N$ on the same line in this table satisfy the two fundamental
inequalities (105) and (110), i.e. the table gives optimum combinations of $n$
and $N$ for a given $c$. To find the absolute minimum we have to compare such
tables for various values of $c$. If two plans, $(n_1, c_1)$ and $(n_2, c_2)$ say, have
overlapping $N$-intervals then $(n_1, c_1, N)$ and $(n_2, c_2, N)$ for all $N$ in the intersection
of the intervals satisfy (105) and (110). To find the optimum solution we have
to compare the costs.

According to (99) the cost function is a linear function of $N$ for given $n$ and $c$.
For given $c$ we have $\Delta_n \gamma(n, c) > 0$ for the $n$-values considered. Thus, for given $c$
the cost function is a piecewise linear function of $N$ with decreasing slopes.





FIG. 5. Costs as function of lot size for given acceptance numbers.


To compare two plans with overlapping $N$-intervals we therefore only have
to determine the point of intersection between the cost functions. For values of $N$
smaller than the corresponding abscissa the plan with the steeper slope is to be
preferred. The $N$-value in question is determined from the formula

$$N = \frac{(k_s - k_r)(n_2 - n_1) + n_2 \gamma(n_2, c_2) - n_1 \gamma(n_1, c_1)}{\gamma(n_2, c_2) - \gamma(n_1, c_1)}. \tag{123}$$

The combinations of $n$, $c$, and $N$ giving minimum costs are therefore to be
found by comparing the linear sections of the cost functions as sketched in fig. 5
for three successive values of $c$, the specially marked $N$-interval designating the
interval with $c$ as optimum acceptance number.
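The comparison of two plans reduces to intersecting two straight lines in $N$. A minimal sketch follows, assuming the cost form $K = n[k_s - k_r + \gamma] + N[k_r - \gamma]$ from (99); the function names and the two $\gamma$ values are illustrative inputs chosen by us.

```python
from fractions import Fraction

def cost(N, n, gamma, ks, kr):
    """K(n, c) as a linear function of N for a plan with known gamma(n, c)."""
    return n * (ks - kr + gamma) + N * (kr - gamma)

def crossing_lot_size(n1, gamma1, n2, gamma2, ks, kr):
    """Lot size at which the cost lines of two plans intersect."""
    return ((ks - kr) * (n2 - n1) + n2 * gamma2 - n1 * gamma1) / (gamma2 - gamma1)

ks = kr = Fraction(1, 4)
plan1 = (5, Fraction(1, 56))  # (n1, gamma1), illustrative values
plan2 = (8, Fraction(1, 45))  # (n2, gamma2), illustrative values
N_cross = crossing_lot_size(*plan1, *plan2, ks, kr)
print(N_cross, float(N_cross))
```

Below the crossing point the plan with the smaller $\gamma$ (steeper slope in $N$) is cheaper; above it the plan with the larger $\gamma$ takes over.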

For practical purposes it is not necessary to make a detailed tabulation of
the optimum plans, particularly since the minimum of the cost function in most
cases is rather broad. A satisfactory system of nearly optimum plans may in
many cases be found by replacing the inequality (105) by the equality
$p_n(c + 1/2) = k_r$ or by using $\frac12(n_c + n_{c+1})$ as the only $n$-value corresponding to
$c$ and then determining the corresponding $N$-intervals from (123).

In the following we shall investigate how $c$ and $n$ increase with $N$ for large $N$.
However, we do not have an expression for the optimum $(n, c)$ but only two
inequalities delimiting the optimum value. As $N \to \infty$ the area within which the
solution is to be found also tends to infinity, see fig. 4. We might therefore study
the points of intersection between the four curves and in this way get limits for
the solution. Instead of using this procedure we have, however, chosen a point
"near the center" of the hatched area in fig. 4 and studied the asymptotic
behaviour of this point. This has been achieved by replacing the inequalities (105)
and (114) which may be written in the form $(a_1/b_1) \le k < (a_2/b_2)$ by equalities
of the form $k = (a_1 + a_2)/(b_1 + b_2)$. We first treat the case where $p_n(x)$ is linear
or approximately linear and the weight function is continuous. This covers at least
the Polya and the rectangular distribution. Using (95) for $p_n(x)$ we get from (105)

$$k_r = [c + \tfrac12 - \bar{p}\lambda]/(n - \lambda) \tag{124}$$

which shows that $c/n \to k_r$ for $n \to \infty$. From (114) we find

$$N - n - \tfrac12 = [2(k_s - k_r) + \gamma(n - 1, c) + \gamma(n, c)]/[\Delta_n \gamma(n - 1, c) + \Delta_n \gamma(n, c)].$$

Using (102) and (124) we have

$$\Delta_n \gamma(n, c) = g_n(c)\, p_n(c)\, (\tfrac12 - k_r)/(n + 1 - \lambda)$$

and

$$\Delta_n \gamma(n - 1, c) = g_{n-1}(c)\, p_{n-1}(c)\, \tfrac12/(n - \lambda).$$

For $n \to \infty$ we then get by means of (90)

$$N - n \sim n^2 \Big[k_s - k_r + \int_0^{k_r} (k_r - p)\, w(p)\, dp\Big] \Big/ \big[(1 - k_r)\, k_r\, w(k_r)/2\big]. \tag{126}$$

We have thus proved the important theorem:

For prior distributions with a continuous limit and a linear or nearly linear
regression function the sample size increases proportional to $N^{1/2}$ for large N.

Furthermore, it is rather easy to discuss the influence of $k_s$, $k_r$ and $w(p)$ on the
relation between $n$ and $N$ by means of the coefficient given in (126).
In the discontinuous case we denote the binomial distribution by

$$b(x; n, p) = \binom{n}{x} p^x q^{n-x}$$

and the cumulative distribution by

$$B(c; n, p) = \sum_{x=0}^{c} \binom{n}{x} p^x q^{n-x}.$$

We then have

$$\gamma(n, c) = \sum_{i=1}^{m} w_i B(c; n, p_i)(k_r - p_i)$$

and

$$\Delta_n \gamma(n, c) = \sum_{i=1}^{m} w_i b(c; n, p_i)\, p_i (p_i - k_r).$$

The equality corresponding to (105) becomes

$$\sum_{i=1}^{m} w_i [b(c; n, p_i) + b(c + 1; n, p_i)](p_i - k_r) = 0$$

which may be reduced to

$$\sum_{i=1}^{m} w_i b(c; n, p_i)[1 + (n - c)p_i/(c + 1)q_i](p_i - k_r) = 0. \tag{129}$$

From (114) we get

$$N - n - \tfrac12 = \frac{2(k_s - k_r) + \sum_i w_i [B(c; n - 1, p_i) + B(c; n, p_i)](k_r - p_i)}{\sum_i w_i [b(c; n - 1, p_i) + b(c; n, p_i)]\, p_i (p_i - k_r)}. \tag{130}$$

The denominator may be written as

$$\sum_{i=1}^{m} w_i b(c; n, p_i)[1 + (n - c)/nq_i]\, p_i (p_i - k_r)$$

and by means of (129) reduced to

$$\sum_{i=1}^{m} w_i b(c; n, p_i)(p_i - k_r)(p_i - (c + 1)/n). \tag{131}$$

If $p_i < k_r < p_{i+1}$ it follows from (105), (76), and (82) that $c/n \to h_{i,i+1}$, cf.
(80). According to (91) we further have

$$b(c; n, p_i) \sim b(c; n, p_{i+1}) \sim [2\pi n h(1 - h)]^{-1/2} e^{-n\beta_i} \tag{132}$$

where $h = h_{i,i+1}$ and

$$\beta_i = h \log [h/p_i] + (1 - h) \log [(1 - h)/q_i] = h \log [h/p_{i+1}] + (1 - h) \log [(1 - h)/q_{i+1}]. \tag{133}$$

A reasoning analogous to the one leading to (93) shows that the denominator
tends to

$$[2\pi n h(1 - h)]^{-1/2} e^{-n\beta_i} \sum_{j=i}^{i+1} w_j (p_j - k_r)(p_j - h)$$

which leads to the final result

$$N - n \sim \Big[k_s - k_r + \sum_{j=1}^{i} w_j (k_r - p_j)\Big]\, [2\pi n h(1 - h)]^{1/2} e^{n\beta_i} \Big/ \tfrac12 \sum_{j=i}^{i+1} w_j (p_j - k_r)(p_j - h) \tag{134}$$

which may be expressed in the following theorem:

For prior distributions with a discontinuous limit the sample size increases
proportional to log N for large N.

We have thus found that there exists an essential difference in the relation
between lot size and sample size in the two cases. The continuous case requires
a larger sample size than the discrete case for large lots. The method employed
in deriving formulae (126) and (134) does not guarantee that the first factors of the
coefficients of $n^2$ and $n^{1/2} e^{n\beta_i}$, respectively, are correct. Numerical results seem to
indicate a systematic error in the coefficient in (134).

In the following three sections we shall derive the optimum sampling plans 


for the most important prior distributions of the mixed binomial type and give 
some numerical examples. 


10. THE RECTANGULAR DISTRIBUTION AS PRIOR DISTRIBUTION

The simplest example of the application of the above theory is found for
$f_N(X) = 1/(N + 1)$. Admittedly, this is not a realistic choice of prior distribution,
but the purpose of the present section is more to demonstrate the
working of the theory in an algebraically simple case than to set up a realistic
model. The main results are

$$K(n, c) = n k_s + (N - n) k_r - (N - n) k_r (c + 1)/(n + 1) + \tfrac12 (N - n)(c + 2)^{(2)}/(n + 2)^{(2)}$$
$$= (k_s - k_r + \gamma(n, c))\, n + (k_r - \gamma(n, c))\, N, \tag{135}$$

$$\gamma(n, c) = [k_r - (c + 2)/2(n + 2)]\, [(c + 1)/(n + 1)], \tag{136}$$

$$\Delta_c K(n, c) = [(N - n)/(n + 1)]\, [-k_r + (c + 2)/(n + 2)], \tag{137}$$

and

$$\Delta_n K(n, c) = k_s - k_r + \frac{N + 1}{n + 1} \cdot \frac{c + 1}{n + 2} \Big(k_r - \frac{c + 2}{n + 3}\Big) + \frac{1}{2} \frac{(c + 2)^{(2)}}{(n + 3)^{(2)}}. \tag{138}$$
In the following an explicit solution of the minimisation problem is given for
$k_s = k_r$ which somewhat simplifies (105). The cost function is a second-degree
polynomial in $c$ for any $n < N$. From (137) it follows that for $n < 2(1 - k_r)/k_r$
the minimum is to be found for $c = 0$, whereas for $2(1 - k_r)/k_r \le n < N$ we
get the first fundamental inequality as

$$k_r(n + 2) - 2 < c \le k_r(n + 2) - 1 \tag{139}$$

which determines the value of $c$ giving minimum of $K$ for the given $n$. For
$n = N$ the cost function is equal to $N k_s$. From (138) it follows similarly that for
$c > 2k_r(N + 1) - 2$ the minimum costs are obtained for $n = N$, whereas for
smaller values of $c$ we get the second fundamental inequality as

$$\frac{(N + 1)(c + 2 - 2k_r)}{(N + 1)k_r + \frac12 (c + 2)} - 1 < n \le \frac{(N + 1)(c + 2 - 2k_r)}{(N + 1)k_r + \frac12 (c + 2)}. \tag{140}$$

The two fundamental inequalities thus correspond to a pair of straight lines
and a pair of hyperbolas, respectively.

To illustrate the theory further $K(n, c)$ has been tabulated for $N = 30$ and
$k_s = k_r = 1/4$ in table 1. Values of $K$ corresponding to values of $(n, c)$ satisfying
one of the inequalities (139) and (140) are shown in italic type, and bold type
denotes that both inequalities are satisfied. It will be seen that three pairs of
values of $c$ and $n$ satisfy both inequalities, namely $(n, c) = (5, 0)$, $(8, 1)$, and
$(11, 2)$. The corresponding values of $K(n, c)$ are 7.05, 7.01, and 7.04 so that the
optimum sampling plan is obtained for $(n, c) = (8, 1)$.
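The figures quoted for $N = 30$ can be reproduced directly from the cost function for the rectangular prior. The following sketch is ours: it uses $G_n(c) = (c + 1)/(n + 1)$ and $\sum_{x \le c} p_n(x) g_n(x) = (c + 1)(c + 2)/2(n + 1)(n + 2)$, searches all plans exhaustively, and again uses the convention `c = -1` for rejecting every lot, which is our own addition.

```python
from fractions import Fraction

def K(n, c, N, ks, kr):
    """Cost for the rectangular prior; c = -1 means every lot is rejected."""
    accept = Fraction(c + 1, n + 1)                               # G_n(c)
    defect = Fraction((c + 1) * (c + 2), 2 * (n + 1) * (n + 2))   # sum of p_n(x) g_n(x), x <= c
    return n * ks + (N - n) * (kr - kr * accept + defect)

N, ks, kr = 30, Fraction(1, 4), Fraction(1, 4)
best = min((K(n, c, N, ks, kr), n, c)
           for n in range(N + 1) for c in range(-1, n + 1))
candidates = {(n, c): round(float(K(n, c, N, ks, kr)), 2)
              for n, c in [(5, 0), (8, 1), (11, 2)]}
print(best, candidates)
```

The exhaustive search confirms that the three candidates are close in cost, which illustrates how flat the minimum is.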

A good approximation to the optimum plan may in the present case be found
by solving the two equalities corresponding to the midpoints of (139) and (140).
This leads to the following results:

$$c = \{(N + 1)k_r(1 - k_r) + [(1 + 3k_r)/4]^2\}^{1/2} - 2 + (1 + 3k_r)/4 \tag{141}$$

and

$$n = \{[(N + 1)(1 - k_r)/k_r] + [(1 + 3k_r)/4k_r]^2\}^{1/2} - 2 - (1 - 3k_r)/4k_r. \tag{142}$$

Thus, $c$ and $n$ are nearly proportional to $N^{1/2}$ and $c/n \to k_r$ as $N \to \infty$. Inserting
(141) in the cost function (135) we find

$$K/N = k_r - [1 - n/N]\, [2k_r(n + 2) - 1]^2/[(2)(4)(n + 1)(n + 2)]$$
$$\approx k_r - [1 - n/N]\, [k_r - 1/2(n + 2)]^2/2 \tag{143}$$
$$\to k_r [1 - k_r/2] \quad \text{for } N \to \infty$$

which is identical with the limiting value of $K_0/N$.

For large $N$ sampling inspection will consequently always be cheaper than
both total inspection and acceptance without inspection if $k_s = k_r$ since

$$k_r [1 - k_r/2] \begin{cases} < k_r & \text{for } 0 < k_r \le \tfrac12 \\ < \tfrac12 & \text{for } \tfrac12 < k_r < 1. \end{cases}$$

For $N = 30$ and $k_r = 1/4$ we get $c = 0.89$, $n = 7.55$, and $K = N \cdot 0.2337 = 7.01$
which correspond closely to the optimum plan.
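The midpoint approximation is easy to evaluate. In the sketch below (function name ours) the square-root expressions are written so that $c + 3/2 = k(n + 2)$ holds exactly, which is the midpoint of (139); this is our reading of the printed formulae and reproduces the numbers quoted for $N = 30$, $k_s = k_r = 1/4$.

```python
from math import sqrt

def approximate_plan(N, k):
    """Midpoint approximation for the rectangular prior with k_s = k_r = k."""
    c = sqrt((N + 1) * k * (1 - k) + ((1 + 3 * k) / 4) ** 2) - 2 + (1 + 3 * k) / 4
    n = sqrt((N + 1) * (1 - k) / k + ((1 + 3 * k) / (4 * k)) ** 2) - 2 - (1 - 3 * k) / (4 * k)
    cost_per_item = k - (1 - n / N) * (2 * k * (n + 2) - 1) ** 2 / (8 * (n + 1) * (n + 2))
    return c, n, cost_per_item

c, n, cost_per_item = approximate_plan(30, 0.25)
print(round(c, 2), round(n, 2), round(cost_per_item, 4), round(30 * cost_per_item, 2))
```

The non-integer values of $c$ and $n$ are not themselves a plan; they indicate where in the $(n, c)$-plane the integer optimum is to be sought.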


Tabulation of the optimum plans for all $N$ may be carried out in the following
way. Solving (139) for $n$ leads to the inequality

$$-2 + (c + 1)/k_r \le n < -2 + (c + 2)/k_r \tag{144}$$

which gives limits for $n$ for every $c$. Similarly (140) leads to limits for $N$

$$-1 + [\tfrac12 n(c + 2)]/[c + 2 - k_r(n + 2)] \le N < -1 + [\tfrac12 (n + 1)(c + 2)]/[c + 2 - k_r(n + 3)] \tag{145}$$

for every $c$ and all $n$ satisfying (144). Comparing plans with successive values
of $c$ and overlapping $N$-intervals to find the optimum plan we have to determine
the point of intersection between the two cost functions from (123). A good
approximation may be derived from the equalities corresponding to the midpoints
of (139) and (140) which lead to

$$n_c = (c + \tfrac32)/k_r - 2 \tag{146}$$

and

$$N_c = [k_r n_c^2 + \tfrac12 (1 + 5k_r) n_c + 2k_r - 3/4]/(1 - k_r) \tag{147}$$

i.e. $n$ is a linear function of $c$, and $N$ is a quadratic function of $n$. The relation
(147) shows that asymptotically $N \approx k_r n^2/(1 - k_r)$ which also follows from
(126). The corresponding cost function is given by (143). It follows that for
the optimum plans $G_n(c) \to k_r$ and $n/N \approx (1 - k_r)/k_r n \to 0$ as $N \to \infty$. Further,
the average fraction defective in accepted lots becomes

$$\tfrac12 (1 - n/N)(c + 2)/(n + 2) \to \tfrac12 k_r$$

and, in the case of sorting of rejected lots, $E(AOQ) = (1 - n/N)[(c + 2)^{(2)}/2(n + 2)^{(2)}] \to k_r^2/2$
according to (118) and (120). For $k_s = k_r = 1/4$ we find

$$4c + 2 \le n < 4c + 6 \tag{148}$$

and

$$-1 + [2n(c + 2)]/[4c + 6 - n] \le N < -1 + [2(n + 1)(c + 2)]/[4c + 5 - n]. \tag{149}$$

TABLE 2
Values of n and N computed from (148) and (149).

     c     n     N
     5    22     76-106
     5    23    107-166
     5    24    167-348
     5    25    349-
     6    26    103-142
     6    27    143-222
     6    28    223-462
     6    29    463-
     7    30    134-184
     7    31    185-286
     7    32    287-592
     7    33    593-
     8    34    169-232
     8    35    233-358
     8    36    359-738
     8    37    739-
     9    38    208-284
     9    39    285-438
     9    40    439-900
     9    41    901-

TABLE 3
Determination of N from (123).

    γ(n_1, c_1)    γ(n_2, c_2)
    0.0166667      0.0178571
    0.0166667      0.0208333
    0.0178571      0.0208333
    0.0208333      0.0227273
    0.0222222      0.0227273
    0.0222222      0.0240385
    0.0222222      0.0247253
    0.0247253      0.0257353
    0.0247253      0.0261438
    0.0261438      0.0267857
    0.0261438      0.0270563
    0.0270563      0.0275000
    0.0270563      0.0276923
    0.0276923      0.0280172
    0.0276923      0.0281609
    0.0281609      0.0284091
    0.0281609      0.0285205
    0.0285205      0.0287162
    0.0285205      0.0288051
    0.0288051      0.0290360
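The $N$-intervals belonging to each permissible $n$ follow mechanically from (148) and (149). The sketch below is illustrative (function names ours) and takes the left-hand inequality in (149) as non-strict, the reading under which the intervals for consecutive $n$ join without gaps or overlaps; `None` marks an interval that is open to the right.

```python
from fractions import Fraction
from math import ceil

def n_values(c):
    """Permissible sample sizes for acceptance number c when k_s = k_r = 1/4."""
    return range(4 * c + 2, 4 * c + 6)

def lot_size_interval(n, c):
    """Smallest and largest integer N for the plan (n, c); None means unbounded."""
    lower = Fraction(2 * n * (c + 2), 4 * c + 6 - n) - 1
    d = 4 * c + 5 - n
    first = ceil(lower)                      # lower <= N
    if d == 0:
        return first, None
    upper = Fraction(2 * (n + 1) * (c + 2), d) - 1
    return first, ceil(upper) - 1            # N < upper

rows = [(n, lot_size_interval(n, 5)) for n in n_values(5)]
print(rows)
```

Running the same loop for successive values of $c$ produces overlapping families of intervals, which is why a cost comparison between neighbouring acceptance numbers is still needed.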


Table 2 contains the values of $n$ and $N$ corresponding to $c = 0, 1, \ldots, 9$.
For the comparison of two plans with overlapping $N$-intervals it is important
to remember that $\gamma(n, c)$ is an increasing function of both $n$ and $c$. Writing
$K(n, c) = \alpha(n, c) + \beta(n, c)N$ it follows that $\alpha(n, c)$ is an increasing and $\beta(n, c)$ a
decreasing function of both $n$ and $c$.

Table 3 shows the comparison of plans with overlapping $N$-intervals by
determining the points of intersection according to (123). For $\gamma(n, c)$ we have

$$\gamma(n, c) = [(c + 1)/4(n + 1)]\, [1 - 2(c + 2)/(n + 2)].$$

The results of tables 2 and 3 lead to the optimum plans in table 4. Besides
the sampling plans also the acceptance probability, $G_n(c) = (c + 1)/(n + 1)$,
and the cost, $K/N$, corresponding to a lot size equal to the midpoint of the
$N$-interval, have been given. Further, $N_c$ has been calculated from (147) and it
is seen that $N_c$ practically equals $N$.

TABLE 4
The optimum sampling plans.

    N           n     c    G_n(c)    K_0/N
                2     0    0.333     0.250
                3     0    0.250     0.245
                4     0    0.200     0.238
                7     1    0.250     0.236
                8     1    0.222     0.233
               12     2    0.231     0.230
     86-134    16     3    0.235     0.228
    135-194    20     4    0.238     0.226
    195-264    24     5    0.240     0.225
    265-345    28     6    0.241     0.224
    346-436    32     7    0.242     0.224
    437-539    36     8    0.243     0.223

It is seen from table 4 that $G_n(c)$ fairly rapidly tends to $k_r = 0.25$, that $n/N$
tends to zero (nearly as $(1 - k_r)/k_r n$), and that $K_0/N$ tends to $k_r - k_r^2/2 =
7/32 = 0.219$, the maximum saving by sampling inspection as compared to
total inspection being 12.4%. It is worth noting that $n/N$ is a misleading measure
of the advantageousness of sampling inspection as compared to total inspection.
Apart from small values of $c$ the optimum plans are obtained by keeping
$(c + 3/2)/(n + 2)$ constant and equal to $k_r$ which means that the acceptance
probabilities (on the OC-curves) for $p = k_r$ are a little below 50%. For large
lots the optimum plans will accept nearly all lots with a fraction defective less
than $k_r = 0.25$ and reject nearly all lots with a higher fraction defective, the
overall acceptance probability thus being 0.25.

FIG. 6. Relation between N and n from table 4.

Fig. 6 shows that the relation between log $N$ and log $n$ is nearly linear. The
dashed line represents the asymptotic relation (126).


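11. THE POLYA DISTRIBUTION AS PRIOR DISTRIBUTION

With the Polya distribution

$$f_N(X) = \binom{N}{X}\frac{\Gamma(s + X)\,\Gamma(t + N - X)}{\Gamma(s)\,\Gamma(t)}\,\frac{\Gamma(s + t)}{\Gamma(s + t + N)} = \binom{N}{X}\frac{B(s + X,\, t + N - X)}{B(s, t)} \tag{150}$$

as prior distribution the inequality (105) becomes

$$-s - t + (s + c)/k_r < n < -s - t + (s + c + 1)/k_r. \tag{151}$$

From (100) and (102) we find

$$\gamma(n, c) = \frac{1}{B(s, t)}\sum_{x=0}^{c}\binom{n}{x}\left[k_r B(s + x,\, t + n - x) - B(s + x + 1,\, t + n - x)\right] \tag{152}$$

and

$$\Delta_n \gamma(n, c) = \frac{1}{B(s, t)}\binom{n}{c}\left[B(s + c + 2,\, t + n - c) - k_r B(s + c + 1,\, t + n - c)\right] \tag{153}$$

leading to

$$F(n, c) = n + 1 + \frac{(k_s - k_r)B(s, t) + \sum_{x=0}^{c}\binom{n}{x}\left[k_r B(s + x,\, t + n - x) - B(s + x + 1,\, t + n - x)\right]}{\binom{n}{c}\left[B(s + c + 2,\, t + n - c) - k_r B(s + c + 1,\, t + n - c)\right]} \tag{154}$$

from which the optimum plans may be determined in the usual way.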
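The quantities γ(n, c), Δ_n γ(n, c), and F(n, c) for a Polya prior lend themselves to direct computation. The following is a minimal sketch, assuming the Beta-function forms given above; the function names are illustrative, and the Beta function is evaluated through `math.lgamma`, so results are floating-point approximations.

```python
from math import comb, exp, lgamma


def beta_fn(a, b):
    """Complete Beta function B(a, b), computed via log-gamma."""
    return exp(lgamma(a) + lgamma(b) - lgamma(a + b))


def sample_size_range(c, s, t, k_r):
    """Open interval for n from the first inequality."""
    return (-s - t + (s + c) / k_r, -s - t + (s + c + 1) / k_r)


def gamma_polya(n, c, s, t, k_r):
    """gamma(n, c) for a Polya prior with parameters s and t."""
    total = sum(
        comb(n, x) * (k_r * beta_fn(s + x, t + n - x) - beta_fn(s + x + 1, t + n - x))
        for x in range(c + 1)
    )
    return total / beta_fn(s, t)


def delta_gamma_polya(n, c, s, t, k_r):
    """Forward difference gamma(n + 1, c) - gamma(n, c)."""
    bracket = beta_fn(s + c + 2, t + n - c) - k_r * beta_fn(s + c + 1, t + n - c)
    return comb(n, c) * bracket / beta_fn(s, t)


def f_polya(n, c, s, t, k_r, k_s):
    """F(n, c) = n + 1 + (k_s - k_r + gamma) / delta_gamma."""
    g = gamma_polya(n, c, s, t, k_r)
    return n + 1 + (k_s - k_r + g) / delta_gamma_polya(n, c, s, t, k_r)


if __name__ == "__main__":
    # Example with s = 1, t = 4, k_s = k_r = 0.2
    print(sample_size_range(0, 1, 4, 0.2))
    print(gamma_polya(1, 0, 1, 4, 0.2))
    print(f_polya(2, 0, 1, 4, 0.2, 0.2))
```

With s = t = 1 the prior is uniform, so `gamma_polya` should agree with the closed form of the preceding section; this gives a convenient consistency check.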

Since γ(n, c) may be tedious to compute it may be worth while to develop an
approximation by means of the Beta distribution. Putting p = X/N we have
from the Polya distribution that

$$M\{p\} = \bar{p} = s/(s + t)$$

and

$$V\{p\} = \bar{p}(1 - \bar{p})\left[1 + (s + t)/N\right]/(s + t + 1).$$

Using a Beta distribution

$$w_N(p) = \frac{1}{B(\alpha_N, \beta_N)}\,p^{\alpha_N - 1}(1 - p)^{\beta_N - 1}$$

as approximation to the (Polya) distribution of p and equating the mean and
the variance of the two distributions we find α_N = s(N − 1)/(N + s + t) and
β_N = t(N − 1)/(N + s + t). It follows that


G0) + Iilon » Bn), hb = [e+ 1/2]/(), (155) 


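where I as usual denotes the ratio of the incomplete and the complete Beta
functions. If the tables of I do not suffice, they may be supplemented by the
tables of the binomial distribution since

$$I_h(\alpha, \beta) = 1 - B(\alpha - 1;\, \alpha + \beta - 1,\, h) = \sum_{x=\alpha}^{\alpha+\beta-1} b(x;\, \alpha + \beta - 1,\, h)$$

for integer values of α and β. The second term of γ(n, c) may be written as

$$\frac{s + t}{s + t + n}\,\bar{p}\,G_n(c) + \frac{n}{s + t + n}\sum_{x=0}^{c}\frac{x}{n}\,g_n(x),$$

where g_n(x) denotes the distribution of the number of defectives in the sample.
Approximating the last sum by the integral

$$\int_0^h p\,w_n(p)\,dp = [s/(s + t)]\,I_h(\alpha_n + 1, \beta_n)$$

we finally get

$$\gamma(n, c) \simeq k_r I_h(\alpha_n, \beta_n) - \bar{p}\left\{\frac{s + t}{s + t + n}\,I_h(\alpha_n, \beta_n) + \frac{n}{s + t + n}\,I_h(\alpha_n + 1, \beta_n)\right\}. \tag{156}$$

From these results it follows for N → ∞ that h → k_r, G_n(c) → I_{k_r}(s, t),

$$\gamma(n, c) \to k_r I_{k_r}(s, t) - \bar{p}\,I_{k_r}(s + 1, t), \tag{157}$$

and the lower limit of the cost per item tending to k_r(1 − I_{k_r}(s, t)) +
p̄ I_{k_r}(s + 1, t). Further we find that the average fraction defective in accepted
lots is approximately equal to

$$\bar{p}\left(\frac{s + t}{s + t + n} + \frac{n}{s + t + n}\,\frac{I_h(\alpha_n + 1, \beta_n)}{I_h(\alpha_n, \beta_n)}\right) \to \bar{p}\,I_{k_r}(s + 1, t)/I_{k_r}(s, t),$$

and, in the case of sorting of rejected lots, that E(AOQ) → p̄ I_{k_r}(s + 1, t).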
The asymptotic formula (126) gives

$$N - n \sim \frac{k_s - k_r + k_r I_{k_r}(s, t) - \bar{p}\,I_{k_r}(s + 1, t)}{k_r^{s}(1 - k_r)^{t}/2B(s, t)}\,n^2. \tag{158}$$

A case of special interest and simplicity occurs for s = 1 since the limit
distribution then becomes w(p) = t(1 − p)^{t−1}, t > 1, and the Polya distribution
may be written as

$$f_N(X) = \frac{t}{N + t}\binom{N}{X}\Big/\binom{N + t - 1}{X} = \binom{N + t - 1 - X}{t - 1}\Big/\binom{N + t}{t} \tag{159}$$

which makes computations particularly easy if also t is an integer.
From (126) we find

$$N - n \sim n^2\left[k_s - \bar{p} + \bar{p}(1 - k_r)^{t+1}\right]/\left[t k_r (1 - k_r)^{t}/2\right], \qquad \bar{p} = 1/(t + 1). \tag{160}$$

If the prior distribution is of this special form then the system of optimum
sampling plans is determined completely from the two parameters t and k_r. The
parameter t may be replaced by p̄ or by

$$P\{k_r\} = \int_0^{k_r} w(p)\,dp = 1 - (1 - k_r)^{t}$$

giving the acceptable fraction of the limiting prior distribution.
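For s = 1 the limiting quantities can be written in closed form, which makes them easy to check. The following is a minimal sketch, assuming the identities I_k(1, t) = 1 − (1 − k)^t and I_k(2, t) = 1 − (1 − k)^{t+1} − (t + 1)k(1 − k)^t; the function name and the returned dictionary keys are illustrative only.

```python
def limits_for_s_equal_1(t, k_r, k_s):
    """Limiting values for a Polya prior with s = 1 (closed forms)."""
    q = 1.0 - k_r
    p_bar = 1.0 / (t + 1)
    accept = 1.0 - q**t                                # P{k_r} = I_{k_r}(1, t)
    i_next = 1.0 - q**(t + 1) - (t + 1) * k_r * q**t   # I_{k_r}(2, t)
    limit_cost = k_r * (1.0 - accept) + p_bar * i_next # lower limit of cost per item
    # Coefficient a in the asymptotic relation N - n ~ a * n**2
    coefficient = (k_s - p_bar + p_bar * q**(t + 1)) / (t * k_r * q**t / 2.0)
    return {
        "p_bar": p_bar,
        "acceptable_fraction": accept,
        "limit_cost": limit_cost,
        "n_squared_coefficient": coefficient,
    }


if __name__ == "__main__":
    for name, value in limits_for_s_equal_1(t=4, k_r=0.2, k_s=0.2).items():
        print(f"{name}: {value:.4f}")
```

For t = 4 and k_s = k_r = 0.2 this reproduces the acceptable fraction 0.59 and a coefficient of 0.4 in the relation between N − n and n².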


TABLE 5
Values of n and N computed from (161) and (163).

γ(n, c) N γ(n, c) N


02666667 1-4 06054856 192-246 
03809524 5-11 06082889 247-340 
04285714 06102616 341-526 
04444444 06114970 527-1087 
04 06120776 1088- 


04848485 06157232 448-687 


05090909 06166778 688-1406 
05221445 
05274725 06200038 563-862 


06207636 863-1762 
05274726 


05439560 06234494 693-1059 
05546218 06240685 1060-2158 


05607843 06262827 836-1274 
05634675 06267969 1275-2595 


05634674 06286537 992-1510 


05724114 06290875 1511-3071 
05784158 


05820083 06306670 1161-1767 
05836250 06310380 1768-3587 


05836250 06323980 1344-2042 
05892368 06327188 2043-4143 


same 06339021 1540-2339 
06341823 2340-4738 


05965222 
06352212 1749-2654 


05965222 
06003707 06354681 2655-5374 


06030510 
06047124 
06054856 


1 
2 
3 
4 
5 
6 
7 
8 
9 





In the general case we have a system depending on three parameters, s, t, and
k_r, where s and t may be replaced by p̄ = s/(s + t) and σ_p² = p̄q̄/(s + t + 1),
or by p̄ and P{k_r} = I_{k_r}(s, t). For practical purposes it may be better to specify
the limiting prior distribution in terms of p̄ and P{k_r} than by means of s and t.

In the following we shall first in a simple example show how to construct a
system of optimum sampling plans by means of the theory developed above.
(A collection of optimum sampling plans based on the Polya distribution will
be published in a forthcoming paper.) Next we shall use the theory to construct
an optimum plan corresponding to a given prior distribution.

The system computed is based on the following values of the parameters:
s = 1, t = 4, k_s = k_r = 0.2, 0.3, and 0.4. This means that the limiting prior
distribution is w(p) = 4(1 − p)³ with p̄ = 0.2, σ_p = 0.16, and P{k_r} = 0.59,
0.76, and 0.87, respectively. From (151), (152), and (154) we find


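$$-5 + (c + 1)/k_r < n < -5 + (c + 2)/k_r, \tag{161}$$

$$\gamma(n, c) = 4\sum_{x=0}^{c}\binom{n}{x}\left[k_r B(x + 1,\, n - x + 4) - B(x + 2,\, n - x + 4)\right], \tag{162}$$

and

$$F(n, c) = n + 1 + \gamma(n, c)/\Delta_n \gamma(n, c). \tag{163}$$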


Table 5 shows for k_r = 0.2 and c = 0, 1, ..., 15 the corresponding values of
n according to (161), γ(n, c) computed from (162), and the N-intervals according
to (163). For c > 7 it is easy to see that we only need the third and fourth


TABLE 6
Determination of N from (123).

γ(n1, c1)        γ(n2, c2)
0.03809524       0.04444444
0.04285714       0.04848485
0.04285714       0.05090909
0.05090909       0.05274726
0.05221445       0.05439560
0.05221445       0.05546218
0.05546218       0.05724114
0.05546218       0.05784158
0.05607843       0.05784158
0.05784158       0.05930881
0.05820083       0.05930881
0.05954416       0.06030510
0.06047124       0.06102616
0.06114970       0.06157232
0.0616677782     0.0620003794
0.0620763603     0.0623449380
0.0624068495     0.0626282687
0.0626796878     0.0628653677
0.0629087531     0.0630667012
0.0631037992     0.0632397968
0.0632718820     0.0633902079
0.0634182318     0.0635221204
TABLE 7
Optimum Sampling Plans.

N            100 G_n(c)   100 n/N   K/N      Saving %
1-4          80.0         50.0      0.1867
5-11         66.7                   0.1728
12-28        57.1                   0.1643
29-47                               0.1587
48-76                               0.1547
77-102                              0.1520
103-144                             0.1499
145-179                             0.1483
180-232                             0.1469
233-275                             0.1458
276-340                             0.1449
341-390                             0.1442
391-467                             0.1435
468-526                             0.1429
527-615                             0.1424
616-687                             0.1419
688-783                             0.1415
784-862                             0.1412
863-971                             0.1408
972-1059                            0.1405
1060-1179                           0.1403
1180-1274                           0.1400
1275-1407                           0.1398
1408-1510                           0.1396
1511-1655                           0.1394
1656-1767                           0.1392
1768-1923                           0.1390
1924-2042                           0.1389
2043-2210                           0.1388
2211-2339                           0.1386
2340-2518                           0.1385
2519-2654                           0.1384
∞            59.0         0.0       0.1345   32.8


N-interval to find the optimum plan and the tabulation has therefore been
. 6 shows the comparison of plans with over- 
\V tabulated is the nearest higher integer of 


to 


It will be seen that G_n(c) ≈ 0.59 for nearly all N. For each N-interval the
geometrical mean N̄ of the interval-limits has been computed. According to (160)
we have N − n ~ 0.4 n², which means that n/N → 0 approximately as 2.5/n for
large n. The minimum cost, K_0/N, tends to

$$k_r - \int_0^{k_r}(k_r - p)\,w(p)\,dp = 0.2 - \int_0^{0.2}(0.2 - p)\,4(1 - p)^3\,dp = 0.1345,$$

which means that the maximum saving becomes 100(0.2 − 0.1345)/0.2 = 32.8%.
It will be seen from the table that already for lot size 100 the saving amounts
to about 25%. Fig. 7 shows that the relation between log N and log n is nearly
linear. The dashed line represents the asymptotic relation N ~ 0.4 n². The
results of similar computations for k_r = 0.3 and 0.4 have been given in table 8
and fig. 8, which clearly show how the sample size decreases with increasing k_r
for given lot size.

Fig. 7. Relation between lot size and sample size from table 7.

To illustrate the foregoing theory by an example we consider the problem
discussed by Kjer (23) regarding control of the fraction defective in carloads
of used bottles returned to a brewery from local dealers. In our terminology
k_r = 0.05 and N = 5000. On the basis of total inspection in a previous period
the (grouped) prior distribution given in table 9 was obtained. The mean is
p̄ = 0.0193 and the variance σ_p² = 0.00010343, using Sheppard's correction.
Assuming now that the prior distribution is a Polya distribution we estimate s
and t by the method of moments, giving s = 3.646 and t = 185.266. With such
values of s and t the Polya distribution is tedious to handle numerically. In the
following computations we have therefore used the Beta distribution as an
approximation, and we have obtained the values of the incomplete
Beta functions by linear interpolation in a table of the binomial distribution
(24), interpolating with regard to all three arguments. The distribution
computed from the values of s and t given above has been given in table 9 for
comparison with the observed distribution.
TABLE 8
Optimum sampling plans. Polya distribution with s = 1 and t = 4.


k_r = 0.2        k_r = 0.3

j 1-171 

72-242 

# 243-335 
336-507 
508-650 11 
651-880 14 
881-1070 16 
1071-1357 19 
1358-1594 21 
1595-1939 24 10 
1940-2222 26 11 
2293-2624 29 12 


* 
~ 




1108 
1228-1441 33 10 1330 
1442-1698 37 11 1565 
1699-2004 40 12 1845 
688-783 
784-862 
863-971 
972-1059 47 
1060-1179 48 
1180-1274 52 
1275-1407 53 
1408-1510 57 
1511-1655 58 
: 1656-1767 62 
' 1768-1923 63 
1924-2042 67 
2043-2210 68 
2211-2339 72 
2340-2518 73 
2519-2654 77 


*)n = 0: accept 
without inspection. 




Since the "break-even" quality is 5% and nearly the whole prior distribution
is below this value it is fairly obvious that the cheapest solution is acceptance
without inspection. If we nevertheless consider a sampling plan we have to
choose one satisfying the first inequality, which amounts to 20c − 116 < n <
20c − 96.

Putting c = 13, say, and choosing n as the midpoint of the corresponding interval
we find n = 154. This leads to α_n = 1.627, β_n = 82.662, h = 0.09123, I_h(α_n, β_n) =
0.9978, I_h(α_n + 1, β_n) = 0.9898, γ(n, c) = 0.03070, and K/N = 0.05000 −
[0.9692][0.03070] = 0.02025, which is larger than 0.0193, i.e. the sampling plan
considered is more costly than acceptance without inspection. It is easy to see
that all combinations of (n, c) satisfying the first inequality lead to values of
α_n and β_n with I-values close to 1 so that γ(n, c) approximately becomes equal
to k_r − p̄ and consequently K/N ≈ p̄ + n(k_r − p̄)/N = 0.0193 + 0.0307n/N,
which has its minimum 0.0193 for n = 0 and its maximum 0.0500 for n = N.

Fig. 8. Relation between lot size and sample size from table 8.

Kjer (23) only compared the costs of sampling inspection with the costs of 
total inspection for the plan (n, c) = (150, 7), using the acceptance probabilities 
and the empirical prior distribution to evaluate the average cost for the plan 


chosen. Also he did not take the number of defectives found in the sample into 
account. 
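The numerical steps of the bottle example (moment estimates of s and t, the Beta parameters for the sample, and the cost per item) can be retraced as follows. This is a minimal sketch: the function names are illustrative, the moment equations assume the Polya mean and variance for finite N, and the two incomplete Beta ratios 0.9978 and 0.9898 are taken from the text as inputs rather than computed.

```python
def polya_moment_fit(p_bar, variance, lot_size):
    """Solve mean = s/(s+t), variance = p(1-p)[1 + (s+t)/N]/(s+t+1) for s, t."""
    r = p_bar * (1.0 - p_bar) / variance
    s_plus_t = (r - 1.0) / (1.0 - r / lot_size)
    return p_bar * s_plus_t, (1.0 - p_bar) * s_plus_t


def beta_parameters(s, t, n):
    """Beta parameters matching mean and variance of the Polya sample fraction."""
    factor = (n - 1.0) / (n + s + t)
    return s * factor, t * factor


def gamma_beta_approx(k_r, p_bar, s, t, n, i_h, i_h_next):
    """Approximation (156); i_h and i_h_next are I_h(a, b) and I_h(a + 1, b)."""
    weight = n / (s + t + n)
    return k_r * i_h - p_bar * ((1.0 - weight) * i_h + weight * i_h_next)


if __name__ == "__main__":
    k_r, N, n = 0.05, 5000, 154
    s, t = polya_moment_fit(0.0193, 0.00010343, N)
    a_n, b_n = beta_parameters(s, t, n)
    g = gamma_beta_approx(k_r, 0.0193, s, t, n, i_h=0.9978, i_h_next=0.9898)
    cost = k_r - (1.0 - n / N) * g  # K/N, assuming k_s = k_r
    print(round(s, 3), round(t, 3), round(a_n, 3), round(b_n, 3))
    print(round(g, 5), round(cost, 5))
```

The sketch returns values close to s = 3.646, t = 185.266, α_n = 1.627, β_n = 82.662, γ(n, c) = 0.03070, and K/N = 0.0202, in line with the figures quoted for the plan (154, 13).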

For the sake of illustration we shall show how the computation of the average 
costs may be carried out on the basis of an empirical prior distribution. At the 
same time the two plans (n, c) = (150, 7) and (150, 13) have been compared, 
the first being the one proposed by Kjer, the second being the one suggested 
by the first inequality for n ~ 150. 


TABLE 9 
Observed and calculated prior distribution. 


Percentage 
defective Prior distribution 
100 p. Observed Calculated 


From (12) it follows that the average cost per item for lots of quality p = X/N
is

$$K(p)/N = k_r + \left(p - (1 - n/N)k_r\right)P_a(p) - \frac{1}{N}\sum_{x=0}^{c} x\,p\{x|X\},$$

where

$$P_a(p) = \sum_{x=0}^{c} p\{x|X\}$$

denotes the acceptance probability. In the present cases P_a(p) has been evaluated
approximately by means of the binomial distribution and the last term by means
of the Poisson distribution.

Denoting the empirical prior distribution by f(p) the average cost is found as

$$K/N = \sum K(p)f(p)/N$$

as shown in table 10. For k_r = 0.05, n = 150, and N = 5000 we have

$$K(p)/N \simeq 0.05 + (p - 0.0485)P_a(p) - (5000)^{-1}\sum_{x=0}^{c} x\left[(np)^x/x!\right]e^{-np},$$

the last term being denoted by d in table 10. The plan (150, 13) is slightly
cheaper than the plan (150, 7). The average cost found from the empirical
distribution is the same as that previously found from the Polya distribution.
For the given n it is obvious that K is rather insensitive to changes in c from
c = 13 and upwards, since c = n = 150 leads to P_a(p) = 1 for all p and K/N =
0.0202.
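The averaging over the empirical prior can be sketched in a few lines. This is an illustration only: the group values of p are inferred from the column "100 p − 4.850" of table 10 and are therefore an assumption, the acceptance probability is computed from the exact binomial rather than from tables, and the function names are not from the paper.

```python
from math import comb, exp, factorial

# Grouped prior (p, f(p)). The p values are inferred from "100 p - 4.850"
# in table 10 and are an assumption of this sketch.
PRIOR = [(0.0025, 0.04), (0.01, 0.33), (0.02, 0.42), (0.03, 0.13),
         (0.04, 0.05), (0.05, 0.02), (0.06, 0.01)]


def binom_cdf(c, n, p):
    """Acceptance probability P_a(p) from the binomial distribution."""
    return sum(comb(n, x) * p**x * (1.0 - p)**(n - x) for x in range(c + 1))


def poisson_partial_mean(c, m):
    """Sum of x * Poisson(x; m) for x = 0, ..., c."""
    return sum(x * m**x * exp(-m) / factorial(x) for x in range(c + 1))


def cost_per_item(p, n, c, lot_size, k_r):
    """K(p)/N = k_r + (p - (1 - n/N) k_r) P_a(p) - d, assuming k_s = k_r."""
    d = poisson_partial_mean(c, n * p) / lot_size
    return k_r + (p - (1.0 - n / lot_size) * k_r) * binom_cdf(c, n, p) - d


def average_cost(n, c, lot_size=5000, k_r=0.05):
    """Average of K(p)/N over the grouped prior."""
    return sum(f * cost_per_item(p, n, c, lot_size, k_r) for p, f in PRIOR)


if __name__ == "__main__":
    for plan in [(150, 7), (150, 13)]:
        print(plan, round(average_cost(*plan), 4))
```

Under these assumptions the plan (150, 13) comes out slightly cheaper than (150, 7), as stated in the text.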

TABLE 10
Computation of average costs for two sampling plans.

Percentage   Prior                           (n, c) = (150, 7)     (n, c) = (150, 13)
defective    distribution
100 p        f(p)           100 p - 4.850    100 P_a    100 d      100 P_a    100 d
             4              -4.600           100.0      0.008      100.0      0.008
             33             -3.850           100.0      0.030      100.0      0.030
             42             -2.850           98.9       0.058      100.0      0.066
             13             -1.850           91.7       0.075      100.0      0.090
             5              -0.850           74.7       0.073      99.7       0.119
             2              +0.150           52.3       0.057      98.1       0.144
             1              +1.150           31.7       0.037      93.2       0.158

Total        100

Average cost
Average acceptance probability

It is not surprising that acceptance without inspection is the cheapest solution
when we have a homogeneous, one-peaked prior distribution with only a few per
cent of the lots of poorer quality than the break-even quality. As a further
example we therefore consider the same prior distribution in relation to
k_r = 0.025 so that about 20% of the lots submitted are of poorer quality.

Replacing again the empirical prior distribution with the corresponding Polya 
distribution we get the relation 

n = 40c — 23 

from the first inequality by choosing the midpoint of the n-interval. To determine 
the optimum combination of n and c for N = 5000 we could use the second 
inequality and then tabulate K(n, c) to find the minimum among the possible 
solutions. In the present case it is, however, easier to tabulate a few values of 


K(n, c) since it is obvious that the optimum value of n cannot be very large. The 
results of such a tabulation are given in table 11. 


TABLE 11
The average costs per item for k_r = 0.025, N = 5000, and n = 40c - 23.

                                       Saving in per cent
n      100 n/N    G_n(c)    K/N        (a)    (b)
57     1.14       0.871     0.0190
177    3.54       0.811     0.0186
217    4.34       0.811     0.0185
257    5.14       0.799     0.0185
297    5.94       0.788     0.0186
377    7.54       0.774     0.0187
457    9.14       0.765     0.0187


The optimum plan is obtained for n = 217 and c = 6, which gives minimum
costs of 0.0185, 4% less than the costs of acceptance without inspection (a) and
26% less than the costs of total inspection (b). The minimum is very flat and
the saving as compared to acceptance without inspection is only small even if
18.9% of the lots are rejected.

To show the mechanism of the whole system and at the same time show that 
the result obtained from the Polya distribution is practically equal to the result 
obtained from the empirical distribution a detailed calculation has been given 
in table 12. 


TABLE 12
Calculation of the average costs per item for
k_r = 0.025, N = 5000, n = 217, and c = 6.

100 P_a(p)    100 K(p)/N    f(p)
100.0         0.347         4
99.3          1.074         33
85.3          2.103         42
2.            2.771         13
3.            2.851         5
8.            2.702         2
2.            2.580         1

100 · Average costs = 1.835
100 · Average acceptance probability = 80.8
The cost function K(p)/N is nearly equal to p for small values of p and nearly
equal to k_r for large values of p. The average cost becomes 0.0184, which is
practically equal to the result found in table 11. Even if the saving is not
considerable, sampling inspection may be advantageous for other reasons, for example
on account of its psychological effect and for keeping the prior distribution
under control. These problems will, however, not be taken up in the present
paper.


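12. THE MIXED BINOMIAL DISTRIBUTION AS PRIOR DISTRIBUTION

Writing the mixed binomial distribution as

$$f_N(X) = \sum_{i=1}^{m} w_i\,b(X; N, p_i), \qquad \sum_{i=1}^{m} w_i = 1,$$

it follows that

$$G_n(c) = \sum_{i=1}^{m} w_i B(c; n, p_i),$$

$$\gamma(n, c) = \sum w_i B(c; n, p_i)(k_r - p_i) = (k_r - \bar{p}_n(c))\,G_n(c),$$

$$\bar{p}_n(c) = \sum w_i B(c; n, p_i)\,p_i / G_n(c),$$

and

$$\Delta_n \gamma(n, c) = \sum w_i\,b(c; n, p_i)\,p_i(p_i - k_r),$$

leading to the first inequality

$$\frac{\sum w_i\,b(c; n, p_i)\,p_i}{\sum w_i\,b(c; n, p_i)} < k_r < \frac{\sum w_i\,b(c + 1; n, p_i)\,p_i}{\sum w_i\,b(c + 1; n, p_i)}, \tag{169}$$

$$F(n, c) = n + 1 + \left[k_s - k_r + \sum w_i B(c; n, p_i)(k_r - p_i)\right]\Big/\left[\sum w_i\,b(c; n, p_i)\,p_i(p_i - k_r)\right], \tag{170}$$

and

$$K(n, c) = nk_s + (N - n)\left\{k_r - \sum w_i B(c; n, p_i)(k_r - p_i)\right\} = nk_s + (N - n)\left(k_r(1 - G_n(c)) + \bar{p}_n(c)\,G_n(c)\right). \tag{171}$$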


The optimum plans may be determined in the usual way from formulas (169)- 
(171), using (169) for determining “n-intervals” corresponding to successive 
values of c, thereafter using (170) for determining “‘N-intervals’” corresponding 
to (c, n)-combinations, and finally (171) to select the optimum combination of 
c, n, and N. For m = 2 the first inequality becomes linear in n and c. This 
important case will be discussed in greater detail later. 

The difficulties in handling the solution for m > 2 stem from (169), which is
difficult to solve with respect to n for given c. It follows from the results in section
9 that for p_i < k_r < p_{i+1} and N → ∞ we have c/n → h_{i,i+1}, see (80), and

$$G_n(c) \to \sum_{j=1}^{i} w_j.$$

If the prior distribution is generated from m processes, each with its own process
average, the result will therefore be that large lots from processes with a process
average less than k_r will be accepted whereas all other lots will be rejected. This
is clearly a reasonable and desirable result.

The average fraction defective in accepted lots therefore tends to

$$\sum_{j=1}^{i} w_j p_j \Big/ \sum_{j=1}^{i} w_j,$$

and

$$E(AOQ) \to \sum_{j=1}^{i} w_j p_j.$$

Finally

$$K_0/N \to \sum_{j=1}^{i} w_j p_j + \sum_{j=i+1}^{m} w_j k_r.$$

In the following we limit the discussion to the mixed binomial distribution
with two components. The first inequality reduces to

$$\alpha + \beta c < n < \alpha + \beta(c + 1) \tag{172}$$

with

$$\alpha = \log\{[w_2(p_2 - k_r)]/[w_1(k_r - p_1)]\}/\log(q_1/q_2) \tag{173}$$

and

$$\beta = \log\{p_2 q_1/q_2 p_1\}/\log(q_1/q_2) \tag{174}$$


TABLE 13

Values of n and N computed from (175) and (177).


γ(n, c)


0.0420000 
0.0498000 
0.0508200 


0.0570660 
0.0622332 
0.0642962 


0.0643512 
0.0682810 
0.0703720 
0.0711034 


0.0717208 
0.0735688 
0.0744988 
0.0747482 


0.0754286 
0.0763354 
0.0767576 
0.0768184 


N c n 


2-7 5 
8-51 
52- 


10-16 
17-36 
37- 


17-24 

25-41 

42-106 
107- 


34-50 

51-92 

93-312 
313- 


63-99 9 34 
100-197 35 
198-1281 36 

1282- 


γ(n, c)


0.0774060 
0.0778586 
0.0780460 


0.0780744 
0.0784980 
0.0787234 
0.0788024 


0.0788928 
0.0791156 
0.0792278 
0.0792550 


0.0793548 
0.0794716 
0.0795260 
0.0795328 


0.0796192 
0.0796814 
0.0797058 




N 


115-191 
192-434 
435- 


135-207 

208-370 

371-1020 
1021- 


238-380 

381-733 

734-2791 
2792- 


422-706 
707-1496 
1497-12559 
12560- 


757-1330 
1331-3203 
3204- 
where p_1 < k_r < p_2. For large values of c we have c/n ≈ 1/β = h, say. The
second inequality and the cost function cannot be simplified but they are rather
easy to handle numerically, as will be shown by an example.

Let us consider a mixed binomial prior distribution given by p_1 = 0.1, p_2 = 0.5,
and w_1 = 0.8 (w_2 = 0.2). Let us further assume that k_s = k_r = 0.2. Since
p̄ = 0.18 < k_r we have a case where acceptance without inspection is cheaper
than total inspection.

The fundamental relations for determining the optimum plans are the following: 


$$-0.49 + 3.74c < n < 3.25 + 3.74c, \tag{175}$$

$$\gamma(n, c) = 0.08\,B(c; n, 0.1) - 0.06\,B(c; n, 0.5), \tag{176}$$

$$F(n, c) = (n + 1) + \frac{0.08\,B(c; n, 0.1) - 0.06\,B(c; n, 0.5)}{-0.008\,b(c; n, 0.1) + 0.030\,b(c; n, 0.5)}. \tag{177}$$

Table 13 shows the combinations of c and n, γ(n, c), and the corresponding
N-intervals determined from F(n, c). By means of the values of γ(n, c) in this
table the points of intersection according to (123) have been calculated in table
14. Combining the results of the two tables we get the system of optimum plans
given in table 15.
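The relations (175) to (177) can be evaluated directly for any two-component mixed binomial prior. The following is a minimal sketch under the definitions of α, β, γ(n, c), and F(n, c) given above; the function names and argument order are illustrative only.

```python
from math import comb, log


def binom_pmf(x, n, p):
    """b(x; n, p)."""
    return comb(n, x) * p**x * (1.0 - p)**(n - x)


def binom_cdf(c, n, p):
    """B(c; n, p)."""
    return sum(binom_pmf(x, n, p) for x in range(c + 1))


def alpha_beta(p1, p2, w1, k_r):
    """Constants of the linear inequality alpha + beta*c < n < alpha + beta*(c+1)."""
    w2, q1, q2 = 1.0 - w1, 1.0 - p1, 1.0 - p2
    denom = log(q1 / q2)
    alpha = log(w2 * (p2 - k_r) / (w1 * (k_r - p1))) / denom
    beta = log(p2 * q1 / (q2 * p1)) / denom
    return alpha, beta


def gamma_mixed(n, c, ps, ws, k_r):
    """gamma(n, c) = sum of w_i B(c; n, p_i)(k_r - p_i)."""
    return sum(w * binom_cdf(c, n, p) * (k_r - p) for p, w in zip(ps, ws))


def f_mixed(n, c, ps, ws, k_r, k_s):
    """F(n, c) = n + 1 + (k_s - k_r + gamma) / delta_gamma."""
    delta = sum(w * binom_pmf(c, n, p) * p * (p - k_r) for p, w in zip(ps, ws))
    return n + 1 + (k_s - k_r + gamma_mixed(n, c, ps, ws, k_r)) / delta


if __name__ == "__main__":
    ps, ws = [0.1, 0.5], [0.8, 0.2]
    print(alpha_beta(0.1, 0.5, 0.8, 0.2))
    for n in (1, 2):
        print(n, gamma_mixed(n, 0, ps, ws, 0.2), f_mixed(n, 0, ps, ws, 0.2, 0.2))
```

For c = 0 the sketch gives γ(1, 0) = 0.0420 and γ(2, 0) = 0.0498, with F(1, 0) and F(2, 0) just above 7 and 51, consistent with the first rows of table 13.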

Together with the plans are shown the costs per item for lots of size N which 
is the midpoint of the N-interval. The costs are compared with the costs of 
acceptance without inspection and the saving in per cent is given in the table. 


TABLE 14 
Determination of points of intersection from (123) 
TABLE 15
Optimum sampling plans.

N       N̄      K/N     Saving %    100 n/N    100 G_n(c)
∞                       33.3        0.0        80.0

N            100 P_a(p_1)   100 P_a(p_2)   (1 - n/N) p̄_n(c)   E(AOQ)
2-7                                         0.112              0.092
8-17                                        0.107              0.075
18-36                                       0.096              0.074
37-40                                                          0.069
41                                                             0.073
42-78                                                          0.071
79-92                                                          0.073
93-137                                                         0.073
138-197                                                        0.075
198-229                                                        0.074
230-384                                                        0.076
385-629                                                        0.076
630-733                                                        0.078
734-1005                                                       0.077
1006-1496                                                      0.078
1497-1570                                                      0.078
The maximum saving is 33.3% and already for rather small lots the saving is
large. The sample size in per cent of the lot size tends nearly exponentially to
zero. The average probability of acceptance is very near its asymptotic value
0.80 for all lot sizes. The probability of acceptance for p = p_1 and p = p_2 has
been computed using the binomial as an approximation to the hypergeometric
distribution. The table shows how the probability of acceptance for lots of
acceptable quality increases with lot size whereas the corresponding probability
for lots of unsatisfactory quality decreases.

Fig. 9. Relation between lot size and sample size from table 15.


TABLE 16 
Optimum sampling plans. 
Mixed binomial distribution with
p_1 = 0.1, p_2 = 0.5, w_1 = 0.8, and w_2 = 0.2


ks = 0.3 


° 


n ec N 


° 


1- 143 
144— 171 
172- 175 
176- 335 
336- 350 
351- 549 
550- 719 
720- 879 
880-1420 

1421-2327 
2328-2588 
2589-3672 
3673-5213 
5214-5726 


213 

297 

492 

680 

859 
1227 2357-3747 
1533 
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0” 
3 
4 
7 
10 
11 
14 
15 
18 
165 19 
22 
25 
26 
29 
30 
33 
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1497-1570 32 


*) n = 0: accept without inspection. 
The table also shows that the average fraction defective of accepted lots for
most lot sizes is between 0.09 and 0.10 (as compared to the average fraction
defective of submitted lots, which is 0.18) and, in the case of sorting of rejected
lots, that the average AOQ is between 0.07 and 0.08. The asymptotic formula
(134) becomes


W —n~2un(k, — ptaennct — mye /| wie, — be, - 0 | 
where 


8, = 0.2675 log (0.2675/0.1000) + 0.7325 log (0.7325/0.9000) = 0.0488 
which leads to 


n + 10.2 log n ~ 20.5 log N — 16.6. 


Fig. 9 shows that the relation between log N and n is nearly linear for large n.
The dashed curve represents the asymptotic relation.

The results of similar computations for k_r = 0.3 and 0.4 have been given in
table 16 and fig. 10, which clearly show how the sample size decreases with
increasing k_r for given lot size.

The Polya distribution used as example in section 11 has mean p̄ = 0.2 and
standard deviation σ_p = 0.16 for N → ∞. The mixed binomial distribution used
here has p̄ = 0.18 and σ_p = 0.16 for N → ∞. The two prior distributions have
thus for large N the same standard deviation and nearly the same mean.
Comparing the optimum sampling plans in tables 8 and 16 it will be seen that for
large N a larger sample size is required in the Polya case than in the binomial,
in accordance with the result that n increases proportionally to N^{1/2} and log N,
respectively. A direct comparison of the relations between N and n for k_r = 0.2
has been given in fig. 11.










Fig. 10—Relation between lot size and sample size from table 16. 





Fig. 11—Relation between lot size and sample size. 


It is tempting to reduce the system of optimum sampling plans to contain
only one plan for each acceptance number, in this manner obtaining a simpler
system and less computation. One possibility is to use the relation

$$n = \alpha + \beta(c + 1/2)$$

to determine n from c and afterwards to use (123) for determining the
N-intervals. This procedure could clearly be generalised by choosing two values
of n, say, for each c and determining the N-intervals from (123).

Table 17 shows a system based on n = α + β(c + 1/2) for k_r = 0.2.

Comparing tables 15 and 17 it will be seen that the system in table 17 represents 
a considerable simplification of the optimum system and that the simpler system 
for most practical purposes is just as satisfactory as the optimum. 

Finally, a realistic example is given. (A collection of optimum sampling plans
based on the mixed binomial distribution with two components will be published
in a forthcoming paper.) Let the prior distribution be defined by p_1 = 0.02,
p_2 = 0.10, and w_1 = 0.8 (w_2 = 0.2), which lead to p̄ = 0.036, and let k_s = k_r =
0.05. For n = α + β(c + 1/2) we find

$$n = -0.33 + 19.90c$$

which for all practical purposes may be replaced by n = 20c. Table 18 contains
the corresponding sampling plans, the N-intervals being determined from (123).
Since N increases nearly exponentially with n and the N-intervals become rather
large, the geometric mean N̄ has been used as representative for each interval.

The maximum saving as compared to acceptance without inspection is
27.8% and already for lot sizes of about 1000 the saving is about 20%. The
sampling fraction decreases exponentially to zero. With existing tables of the
binomial distribution it is impossible to determine the N-intervals to more than
2-3 significant figures for large n. Fig. 12 shows the relation between lot size and
sample size.
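The simplified rule and the quoted maximum saving can be verified with a short calculation. This is a minimal sketch, assuming the limiting cost per item equals the sum of w_i min(p_i, k_r) (accept components with p_i below k_r, reject the others) and that the saving is measured against acceptance without inspection; the function names are illustrative.

```python
from math import log


def alpha_beta(p1, p2, w1, k_r):
    """Constants alpha and beta of the two-component first inequality."""
    w2, q1, q2 = 1.0 - w1, 1.0 - p1, 1.0 - p2
    denom = log(q1 / q2)
    alpha = log(w2 * (p2 - k_r) / (w1 * (k_r - p1))) / denom
    beta = log(p2 * q1 / (q2 * p1)) / denom
    return alpha, beta


def simplified_sample_size(c, alpha, beta):
    """One plan per acceptance number: n = alpha + beta * (c + 1/2)."""
    return alpha + beta * (c + 0.5)


def max_saving_percent(ps, ws, k_r):
    """Limiting saving relative to acceptance without inspection."""
    p_bar = sum(w * p for p, w in zip(ps, ws))
    limit_cost = sum(w * min(p, k_r) for p, w in zip(ps, ws))
    return 100.0 * (p_bar - limit_cost) / p_bar


if __name__ == "__main__":
    a, b = alpha_beta(0.02, 0.10, 0.8, 0.05)
    print(round(a + b / 2, 2), round(b, 2))            # intercept and slope in c
    print(round(simplified_sample_size(5, a, b)))      # n for c = 5
    print(round(max_saving_percent([0.02, 0.10], [0.8, 0.2], 0.05), 1))
```

The intercept and slope come out close to −0.33 and 19.90, and the limiting saving close to 27.8%; the same function gives 33.3% for the earlier example with p_1 = 0.1, p_2 = 0.5, and k_r = 0.2.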
TABLE 17
A system of nearly optimum sampling plans based on
n = α + β(c + 1/2) and k_r = 0.2

c     N     K/N       Saving %    100 n/N    100 G_n(c)
            0.1664    7.6
            0.1513    15.9
            0.1407    21.8
            0.1346    25.2
            0.1307    27.4
            0.1275    29.2
            0.1250    30.6
            0.1235    31.4
            0.1225    31.9
      ∞     0.1200    33.3

100 P_a(p_1)   100 P_a(p_2)   (1 - n/N) p̄_n(c)   E(AOQ)
               50.0           0.119              0.098
                              0.093              0.072
                              0.092              0.071
                              0.092              0.072
                                                 0.075
                                                 0.076
                                                 0.076
                                                 0.077
                                                 0.078
13. CONCLUDING REMARKS

The approach to the theory of sampling inspection presented here differs in
several respects from the usual ones.

First of all it is based directly on the hypergeometric distribution instead of,
as usual, on approximations. The difficulty of handling the hypergeometric
distribution has led most authors to introduce the binomial or the Poisson
distribution as approximation at the very beginning of the development of a theory.
As shown in the present paper this procedure is not necessary and in many
cases it has the opposite effect of that intended. The combination of a prior
distribution and the hypergeometric distribution has led to the compound
hypergeometric distribution and to the concept of reproducible distributions.
Both these concepts deserve further study and may be generalized in various
ways.

Next the theory is based on a simple economic model which requires (in the
simplest and most useful case) only one cost-parameter to be evaluated, namely
the ratio between the cost per item rejected and the cost per defective item
accepted. Further, the model covers both the case of destructive and
non-destructive inspection.

TABLE 18
A system of nearly optimum sampling plans.

γ(n, c)      N
0.0186449    -223
0.0206758    224-411
0.0218534    412-678
0.0225837    679-1065
0.0230527    1066-1620
0.0233601    1621-2425
0.0235644    2426-3590
0.0237019    3591-5270
0.0237951    5271-7690
0.0238586    7691-11200
0.0239021    11201-16200
0.0239320    16201-23400
0.0239527    23401-33500

Within the framework of this model it is possible to solve a number of questions
which for some time have troubled authors trying to set up systems of sampling
plans, namely such questions as:

What saving is obtained, if any, by sampling inspection as compared to total
inspection and acceptance without inspection?

What is the optimum relation between sample size and lot size?

How does the over-all probability of acceptance depend on lot size?

How should the probability of acceptance for lots of acceptable quality
increase with lot size?


Questions of this nature can be solved for any prior distribution and a rather 
simple solution has been obtained for the compound binomial distribution. 
An important step in the further work would be to collect information on the 
form of prior distributions from practice to see how well such distributions 
could be fitted by simple compound binomial distributions. 

The theory may be generalised in various ways, for example to include double 
and multiple sampling. As remarked in the introduction a very important step 
would be to build into the theory a feed-back mechanism so that the sampling 
plans would be automatically adjusted to changes in the prior distribution esti- 
mated from previous inspection results. 

To use the theory presented here in its simplest form in practice we have to 
distinguish between two cases. If we have data for the prior distribution we 
estimate the parameters in a Polya distribution or a mixed binomial distribution 
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Fig. 12—Relation between lot size and sample size from table 18. 


with two components whatever of these two distributions seems to be the best 
choice. The determination of the optimum plan is then straight-forward accord- 
ing to the theory developed in sections 11 and 12. Tables corresponding to these 
distributions will be published so that the optimum plan may be read from the 
tables when the parameters of the prior distribution is known (together with the 
cost parameters). 

If, however, we have no data but only a vague knowledge of the prior distri- 
bution, we have to guess at a limiting distribution and choose the optimum 
plan accordingly. This guess requires at most two parameters in the Polya case 
and three parameters in the case of a mixed binomial with two components. The 
meaning of these parameters is evident. It is believed that it will be much 
easier to choose these parameters and explain their meaning in practice, i.e. 
to get an intelligent discussion between the engineer or the inspector and the 
statistician as the basis for determining a sampling plan, than to choose the 
(four) parameters of the usual systems which require guessing at AQL (or $p_1$),
LTPD (or $p_2$), $\alpha$, and $\beta$, or inspection levels. As information from sampling
inspection accumulates it should be used at regular intervals to test and estimate
changes of the prior distribution.
Some Remarks on the Bayesian Solution of the
Single Sample Inspection Scheme 


G. B. WETHERILL 
Birkbeck College, University of London 


Batches of items, assumed independent of one another, are to be sentenced using
an acceptance sampling plan based upon a random sample of size n and an acceptance
number $c_n$. Granting that the prior distribution of the probability of a defective item
is a mixed binomial distribution, and using a simple cost criterion, a rapid technique
is proposed for computing n and $c_n$. The method appears robust over wide changes
in the parameters of the model.


1. DEFINITIONS AND ASSUMPTIONS 


We assume here that batches of items are presented for acceptance inspection, 
and each item can be classified as either effective or defective. Each batch is 
to be sentenced by selecting from it a random sample of size n where n is small 
in relation to the batch size. If the number of defectives in this random sample,
r, is greater than an acceptance number $c_n$, the batch is rejected, and if $r \le c_n$,
the batch is accepted.

We assume the prior distribution of the probability, p, of a defective item
to be the mixed binomial distribution (Barnard, 1954)

$$\mathrm{prob}\,(p = p_i) = a_i, \qquad i = 1, 2, \ldots, k, \qquad \sum_{i=1}^{k} a_i = 1, \tag{1}$$

where $a_i$ is the probability with which a batch has proportion defective $p_i$. In
this paper different batches are assumed to be independent. The loss associated
with accepting a batch when $p_i$ is true is denoted by $W_{1i}$, and the loss associated
with rejecting a batch when $p_i$ is true is denoted by $W_{2i}$, where the cost of
inspecting an item is taken as the unit of costs.

We shall adopt the convention that $p_1 > p_2 > \cdots > p_k$, and the break-even
quality, at which a batch may without loss be accepted or rejected, is denoted
by $p_0$. Then $W_{1j}$ will be zero for all j such that $p_j < p_0$, and $W_{2j}$ will be zero
for all j such that $p_j > p_0$.

We find that the equations for an optimum solution assume a particularly
simple form if we take

$$\frac{p_i}{q_i} = (p')^{i/a}, \tag{2}$$

where $q_i = 1 - p_i$, and $p'$ and a are suitable constants. This assumption is not very
limiting since $p'$, a, and the $a_i$ can be chosen in some way to represent the available
information of the process curve. A representation to any desired accuracy can be
obtained by taking a sufficiently large value for k, the number of ordinates.
2. EQUATION FOR THE NEUTRAL LINE


The neutral line is defined to be the locus of points $(n, c_n)$ such that at each
of these points, the loss associated with accepting a batch is equal to that of
rejecting it. The equation for the neutral line is therefore

$$\sum_{i=1}^{k} a_i \, {}^{n}C_{c_n} \, p_i^{c_n} q_i^{\,n-c_n} (W_{1i} - W_{2i}) = 0. \tag{3}$$

If we insert assumption (2) and cancel out the combinatorial term, we have

$$\sum_{i=1}^{k} a_i q_i^{n} (W_{1i} - W_{2i}) (p')^{c_n i/a} = 0.$$


This last equation can be written in the form

$$\sum_{i=1}^{k} A_i x^{i} = 0, \tag{4}$$

where

$$x = (p')^{c_n/a}, \qquad A_i = a_i q_i^{n} (W_{1i} - W_{2i}).$$

After the common factor x is removed, equation (4) is of degree (k - 1). Given n,
we can find the appropriate value of $c_n$ by solving this equation for a root in the
interval (0, 1); this root can be found easily by Horner's or some similar method.
Having found x, we then have

$$c_n = a \, \frac{\log x}{\log p'}. \tag{5}$$


Equation (4) will usually have a root in the range (0, 1), since when x = 0 the
left-hand side of equation (4), with the common factor x removed, is

$$a_1 q_1^{n} (W_{11} - W_{21}),$$

and provided $p_1$, the largest p-value in the prior distribution, is greater than
$p_0$, the break-even quality, $W_{21}$ is zero and the left-hand side of equation (4)
is positive. When x = 1 we have

$$\sum_{i=1}^{k} a_i q_i^{n} (W_{1i} - W_{2i}),$$

which is proportional to the expected cost of accepting minus the expected cost 
of rejecting when we have taken n observations and they are all effectives. This 
quantity will be negative in the range for which inspection is desirable, for 
otherwise it pays to reject rather than accept a batch for which the sample 
contains no defectives. We shall return later to the question of the neutral line 
and its properties. 
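
The root-finding step of this section can be sketched numerically. The following
is a minimal illustration only, not the method of the paper: it assumes that
assumption (2) has the form $p_i/q_i = (p')^{i/a}$, replaces Horner's method by plain
bisection, and uses the parameter values of Example 2 of section 5. The names
`neutral_line_c`, `prior` and `a_const` are hypothetical.

```python
import math

def neutral_line_c(n, prior, p, W1, W2, p_prime, a_const):
    """Return (x, c_n) on the neutral line for sample size n.

    prior[i], p[i] : ordinate weights a_i and proportions defective p_i
    W1[i], W2[i]   : losses for accepting / rejecting when p_i is true
    p_prime, a_const : the constants p' and a of assumption (2) (assumed form)
    """
    k = len(p)
    # Coefficients A_i = a_i * q_i**n * (W_1i - W_2i) of equation (4).
    A = [prior[i] * (1.0 - p[i]) ** n * (W1[i] - W2[i]) for i in range(k)]

    def f(x):
        # Equation (4) with the common factor x removed: sum_i A_i * x**(i-1).
        return sum(A[i] * x ** i for i in range(k))

    lo, hi = 0.0, 1.0
    if not (f(lo) > 0.0 > f(hi)):
        raise ValueError("no sign change on (0, 1); inspection may not pay")
    for _ in range(200):  # bisection, used here instead of Horner's method
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    return x, a_const * math.log(x) / math.log(p_prime)  # equation (5)

# Example 2 of section 5: four ordinates, p' = 1/2, a = 1.
prior = [0.10, 0.25, 0.10, 0.55]
p = [0.3, 0.20, 0.1, 0.0588]
W1 = [900.0, 300.0, 0.0, 0.0]   # loss of accepting (zero below break-even)
W2 = [0.0, 0.0, 300.0, 600.0]   # loss of rejecting (zero above break-even)

x46, c46 = neutral_line_c(46, prior, p, W1, W2, 0.5, 1.0)
print(f"n = 46: x = {x46:.5f}, c_n = {c46:.2f}")
```

Under these assumptions the value of $c_n$ at n = 46 falls between 6 and 7, which
agrees with the integer acceptance number c = 6 quoted for Example 2.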


3. EQUATION FOR OPTIMUM SAMPLE SIZE


For a given sample size, the acceptance number will be determined by the 
neutral line, and our problem now is to obtain in a suitable form an equation 
for the optimum n. 





BAYESIAN SOLUTION OF SINGLE SAMPLE INSPECTION 


Denote the expected loss for a given n by R(n), where

$$R(n) = n + \sum_{i=1}^{k} \sum_{r=0}^{n} a_i \, \mathrm{prob}\,(r \text{ defectives} \mid p_i \text{ true}) \times (\text{loss at } (n, r) \mid p_i \text{ true});$$

then the optimum value of n is such that

$$R(n - 1) > R(n) < R(n + 1).$$

We shall therefore consider the difference $R(n + 1) - R(n)$, of the risk for taking
a sample size of (n + 1) minus the risk for a sample size of n. This difference in
cost is made up of the cost of the extra observation plus a contribution from
sample points such that a different decision is made at n and (n + 1). The
difference $R(n + 1) - R(n)$ will change sign from negative to positive at the
optimum n.

For example, suppose n = 10, c = 5, and n = 11, c = 5. All samples rejected
at n = 10 would be rejected at n = 11 whatever the result of the eleventh
observation. However, if we had a sample (n = 10, r = 5), this would be accepted
at the n = 10 boundary, but only accepted at the n = 11 boundary if the result
of the eleventh observation was an effective. Thus for those $p_i$ for which
acceptance is the wrong decision, cost would be saved by going from a sample size
of 10 to 11 for those samples which arrive at (11, 6) via (10, 5). Similarly, there
would be extra cost for those $p_i$ for which acceptance is the right decision for
samples arriving at (11, 6) via (10, 5).

We now derive an equation for the optimum n by equating the difference
$\{R(n + 1) - R(n)\}$ to zero. In order to avoid awkward inequalities, we shall
suppose that there is a point $(n, c_n)$ at just that value of $c_n$ given by the neutral
line, and that the slope of the neutral line is less than one. Thus $(n + 1, c_n + 1)$
is a rejection point and $(n + 1, c_n)$ is an acceptance point.

Now since $(n, c_n)$ is exactly on the neutral line, we could without loss accept
or reject. It suits our purpose here to have a randomised decision, with a
probability of 1/2 of accepting or rejecting the batch. The assumptions listed above
can be regarded as reasonable approximations necessary to obtain a suitable
equation.

Consider first those sample points for which the extra item inspected is a
defective, and where a batch is accepted at a sample size of n, and rejected at a
sample size of (n + 1). The contribution to the change in risk $R(n + 1) - R(n)$
from this component is

$$\frac{1}{2} \sum_{i=1}^{k} a_i \, {}^{n}C_{c_n} \, p_i^{c_n} q_i^{\,n-c_n} \, p_i \, (W_{2i} - W_{1i}).$$

Now consider those sample points for which the extra item inspected is an
effective, and where a batch is rejected at a sample size of n and accepted at a
sample size of (n + 1). The contribution to the change in risk $R(n + 1) - R(n)$
from this component is

$$\frac{1}{2} \sum_{i=1}^{k} a_i \, {}^{n}C_{c_n} \, p_i^{c_n} q_i^{\,n-c_n} \, q_i \, (W_{1i} - W_{2i}).$$

Therefore, since $(p_i - q_i) = 2p_i - 1$,

$$R(n + 1) - R(n) = \frac{1}{2} \sum_{i=1}^{k} a_i \, {}^{n}C_{c_n} \, p_i^{c_n} q_i^{\,n-c_n} (W_{2i} - W_{1i})(2p_i - 1) + 1. \tag{6}$$


If we equate this to zero, and use (2) and (4) we have that

$$\sum_{i=1}^{k} \beta_i x^{i} = 1/\,{}^{n}C_{c_n}, \tag{7}$$

where $\beta_i = a_i p_i q_i^{n} (W_{1i} - W_{2i}) = p_i A_i$, and this equation is of the same form as
equation (4).

Equations (4) and (7) determine the optimum sampling scheme for given
prior probabilities and loss functions. A convenient method of solution is to
proceed as follows. We pick two or three values of n, and determine x and $c_n$
from equation (4) for each. Plot $-\log (\sum_{i=1}^{k} \beta_i x^{i})$ and $\log {}^{n}C_{c_n}$ against n on the
same graph paper. The intersection of these two functions gives the optimum
value of n. Two or three more trial values of n will usually determine the point
of intersection sufficiently accurately. Iterative procedures could be developed
if desired. Examples are given below in section 5.

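If k = 2, and if $p_1 > p_0 > p_2$, equation (4) becomes

$$x = \frac{a_1 q_1^{n} W_{11}}{a_2 q_2^{n} W_{22}}, \tag{8}$$

so that from equation (5)

$$c_n = \frac{a}{\log p'} \left\{ \log \frac{a_1 W_{11}}{a_2 W_{22}} + n \log \frac{q_1}{q_2} \right\}. \tag{9}$$

Equation (7) is, in terms of x,

$$a_1 p_1 q_1^{n} W_{11} \, x - a_2 p_2 q_2^{n} W_{22} \, x^{2} = 1/\,{}^{n}C_{c_n},$$

and if we substitute from equation (8) we have

$$\frac{a_2 W_{22}}{(a_1 W_{11})^{2} (p_1 - p_2)} \left( \frac{q_2}{q_1^{2}} \right)^{n} = {}^{n}C_{c_n}, \tag{10}$$

so that

$$\log \frac{a_2 W_{22}}{(a_1 W_{11})^{2} (p_1 - p_2)} + n \log \frac{q_2}{q_1^{2}} = \log {}^{n}C_{c_n}. \tag{11}$$

Write this equation as

$$y(n) = \log \frac{a_2 W_{22}}{(a_1 W_{11})^{2} (p_1 - p_2)} + n \log \frac{q_2}{q_1^{2}}, \tag{12}$$

$$z(n) = \log {}^{n}C_{c_n}. \tag{13}$$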


We now solve as before to obtain the value of n for which y(n) = z(n),
determining the value of $c_n$ from equation (9) and using equation (19) of the
appendix. However, a useful approximation holds in this case. In the appendix
we show that

$$\log {}^{n}C_{c_n} = n[(1 - b)A + bB] + [(a + \tfrac{1}{2})B + (\tfrac{1}{2} - a)A] + \Delta,$$

where A, B and $\Delta$ are almost constant for small changes in n not close to the
origin (here, as in the appendix, a and b denote the intercept and slope of the
neutral line written as $c_n = a + bn$). If we represent equation (12) as

$$y(n) = -l + mn,$$

the optimum value of n is

$$n = \frac{l + (a + \tfrac{1}{2})B + (\tfrac{1}{2} - a)A + \Delta}{m - [(1 - b)A + bB]}. \tag{14}$$

The procedure to obtain a solution is therefore as follows. Evaluate equations
(12) and (13) for several values of n in order to make a guess at the optimum
value. Evaluate A, B and $\Delta$ for the guessed value of n and substitute in equation
(14). This procedure can be repeated if required.

The simplification introduced by assumption (2) is to reduce otherwise
complicated equations to polynomials, for which a simple method of solution is
readily available.


5. EXAMPLES


Example 1. Suppose that we have a two-ordinate distribution, $p_1 = 0.09132$,
$p_2 = 0.01$, $a = 2$, $p' = 0.01$, $W_{11} = W_{22} = 2000$, $a_1 = 0.20$, $a_2 = 0.80$. The
neutral line is

$$c_n = 0.6034 + 0.03730n$$

and the equation for the optimum sample size is

$$-0.9102 + 0.0788n = \log {}^{n}C_{c_n}.$$


The following results are obtained.

  n      c_n       y(n)      z(n) (approx.)

  30    1.7225    1.4543    1.4771
  70    3.2148    4.6069    4.9986
 100    4.3339    6.9714    7.0281
 105    4.5204    7.3655    7.3689
 106    4.5578    7.4443    7.4390
 110    4.7070    7.7595    7.7095


The value for which y(n) and z(n) are most nearly equal is n = 105. If the
approximate method is applied at n = 100 (using equation (14)), we arrive at
n = 105.5. We would therefore take n = 105, $c_n = 4$, to be the optimum sampling
scheme. Actually R(105, 4) = 124.7, and the true sampling scheme giving
minimum risk is n = 97, $c_n = 4$, with a risk of 122.7.
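
The search for the crossing of y(n) and z(n) in Example 1 can be sketched as
follows. This is an illustration under stated assumptions rather than the
procedure used for the table: the neutral line is computed directly from the
two-ordinate equality of accepting and rejecting losses, y(n) from equation (11),
and $\log {}^{n}C_{c_n}$ for non-integer $c_n$ from the log-gamma function instead of
Stirling's theorem. The function names are hypothetical.

```python
import math

# Example 1 parameters as given in the text.
p1, p2 = 0.09132, 0.01
a1, a2 = 0.20, 0.80
W11 = W22 = 2000.0
q1, q2 = 1.0 - p1, 1.0 - p2

def c_n(n):
    """Neutral line: a1*W11*p1^c*q1^(n-c) = a2*W22*p2^c*q2^(n-c), solved for c."""
    num = math.log10(a2 * W22 / (a1 * W11)) + n * math.log10(q2 / q1)
    return num / math.log10(p1 * q2 / (p2 * q1))

def y(n):
    """Left-hand side of equation (11), a straight line in n."""
    const = math.log10(a2 * W22 / ((a1 * W11) ** 2 * (p1 - p2)))
    return const + n * math.log10(q2 / q1 ** 2)

def z(n):
    """log10 of nC_c at c = c_n(n), extended to non-integer c by lgamma."""
    c = c_n(n)
    ln_binom = math.lgamma(n + 1) - math.lgamma(c + 1) - math.lgamma(n - c + 1)
    return ln_binom / math.log(10.0)

# Integer n at which y(n) and z(n) are most nearly equal.
best_n = min(range(30, 151), key=lambda n: abs(y(n) - z(n)))

for n in (100, 105, 106, 110):
    print(n, round(c_n(n), 4), round(y(n), 4), round(z(n), 4))
print("closest agreement at n =", best_n)
```

Because y(n) and z(n) cross at a shallow angle, small rounding differences can
move the nearest integer by one or two units, which is in keeping with the flat
optimum noted in the text.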
We briefly consider two more examples. 

Example 2: $p_1 = 0.3$, $p_2 = 0.20$, $p_3 = 0.1$, $p_4 = 0.0588$, $p' = \tfrac{1}{2}$, $a = 1$,
$a_1 = 0.10$, $a_2 = 0.25$, $a_3 = 0.10$, $a_4 = 0.55$, $W_{11} = 900$, $W_{12} = 300$,
$W_{23} = 300$, $W_{24} = 600$. The neutral line is

$$90 q_1^{n} x + 75 q_2^{n} x^{2} - 30 q_3^{n} x^{3} - 330 q_4^{n} x^{4} = 0$$

and $c_n = -\log (x)/\log (2)$. The equation for the optimum n is

$$90 q_1^{n} p_1 x + 75 q_2^{n} p_2 x^{2} - 30 q_3^{n} p_3 x^{3} - 330 q_4^{n} p_4 x^{4} = 1/\,{}^{n}C_{c_n}.$$
From these equations we have the following values.

  n     -log (L.H.S. of (7))     log nC_c (approx.)

  18          2.37                    3.10
  36          5.83                    6.04
  46          7.64                    7.69
  50          9.03                    8.01


The optimum solution seems to be n = 46, c = 6. 

Example 3: Two-ordinate example used in section 8. $p_1 = 0.09$, $p_2 = 0.01$,
$a_1 = 0.40$, $a_2 = 0.60$, $p' = 0.09890$, $a = 1.0070$, $W_{11} = W_{22} = 3000$.

We have the following table of results.

  n      y(n)     z(n) (approx.)

 100     5.95      6.38
 120     7.50      7.71
 140     9.05      9.04
 150     9.82      9.30

If formula (14) is applied at n = 150, we arrive at n = 140.2 as an optimum
sample size, and n = 140 is the value for which y(n) and z(n) are most nearly
equal. Further calculations on this example are given in section 8.


6. THE NEUTRAL LINE


Calculations such as those given in the previous section seem to indicate 
that the neutral line is asymptotically linear in n. Below we give a proof that 
for any finite number of ordinates this conjecture is true. The proof depends 
upon the following lemmas, which are easily proved. 


Lemma 1: The function $y(p) = p^{W}(1 - p)^{1-W}$, for $0 < p < 1$ and $0 < W < 1$,
where W is a constant, takes its maximum at p = W.


Lemma 2: For fixed values of $p_1$ and $p_2$ such that $0 < p_2 < p_1 < 1$, and values
of p in the range $0 < p < 1$, the ratio

$$R(p) = \left( \frac{p_1}{p_2} \right)^{p} \left( \frac{1 - p_1}{1 - p_2} \right)^{1-p}$$

is unity for some value of p denoted $p_{12}$. For $p > p_{12}$, $R(p) > 1$, and for $p < p_{12}$,
$R(p) < 1$.


Theorem. For a prior distribution of type (1) with $p_1 > p_2 > \cdots > p_k$ and with
the $a_i$ and associated loss functions non-zero, the neutral line yields an
acceptance number $c_n$ which is asymptotically linear in n.

To prove this we notice that if we divide equation (3) by the probability
term involving $p_1$, we have (k - 1) terms of the type $R^{n}(p)$ where R(p) is a
term of the type given in Lemma 2. We can therefore specify the $p_{1j}$, $j = 2,
3, \ldots, k$, at which these ratios are one.

Suppose that there existed a value of n, say $n'$, such that for all $n > n'$,
$c_n > p_{12} n$. By increasing n, the term of (3) involving $p_1$ would then be large
compared to all the others, and thus such a solution is impossible.
By reasoning in this way, we see that the condition that the neutral line
yield a solution for large n is that at least one probability term on each side
of the break-even quality be of comparable size. Suppose $p_i > p_0 > p_{i+1}$; then the
condition for a solution for large n is that $c_n/n$ is in the neighbourhood of $p_{i,i+1}$.
But if this is so, all other terms of (3) but those involving $p_i$ and $p_{i+1}$ become
negligible with increasing n. Thus $c_n \sim n p_{i,i+1}$ as $n \to \infty$.

This property of $c_n$ holds for any fixed and finite set of ordinates $p_1, \ldots, p_m$,
but the rate at which the asymptote is approached depends upon these ordinates,
the break-even quality, and the relative orders of the series of products $a_i W_{1i}$
for i such that $p_i > p_0$, and $a_i W_{2i}$ for i such that $p_i < p_0$.

If we consider example (2) by the above theorem, the $c_n$ for this example
should be asymptotically dependent upon the terms of the neutral line
involving $p_2$ and $p_3$, so that for large n we should find

$$c_n = \frac{a}{\log p'} \left\{ \log \frac{a_2 W_{12}}{a_3 W_{23}} + n \log \frac{q_2}{q_3} \right\}. \tag{15}$$

For example 2 of section 5 this equation is

$$c_n = -1.32 + 0.15n$$

and below in table 1 are listed the values of $c_n$ given by the exact theory and by
this asymptotic theory for various values of n. The tendency to the asymptote
is clearly indicated, although in the region of the optimum sample size (46)
there is still an appreciable difference. It is interesting to note, though, that the
neutral line for this example quickly becomes approximately parallel to the
asymptote.

Now since the left-hand side of equation (7) is of similar form to the neutral
line, the same two terms will dominate this equation as those which dominated
the equation for the neutral line, and a large sample approximate solution can
be constructed in this way. Thus when we perform initial calculations to find
the region of the optimum sample size, in the manner indicated above, we can
very quickly and easily carry out these calculations for a large sample size by
using the asymptotic approximation.

This asymptotic theory also seems to indicate that the break-even quality
has an important function in both equations which determine an optimum
solution $(c_n, n)$, and it may be that changes in the break-even quality cause
drastic changes in the optimum sample size.


TABLE 1
Values of $c_n$ given by the exact and asymptotic theories for Example 2


n Ca 
asymptotic asymptotic 


0.20 46 : 5.67 
50 : 6.27 
100 : 13.88 
Figure 1. Risk function plotted against n.


This argument can be generalised to deal with a general prior distribution
P(p) instead of (1), and general loss functions $W_1(p)$ and $W_2(p)$, where $W_1(p) = 0$
for $p < p_0$, and $W_2(p) = 0$ for $p > p_0$. It turns out that for a continuous
distribution P(p), $c_n \sim n p_0$. This argument applies, for example, if P(p) is a Beta
distribution.

In order to investigate the rate at which this asymptote is approached, we
may notice that the neutral line will involve a term $p^{c_n} q^{\,n-c_n}$, and for large $(c_n, n)$
this will be approximately a normal density function with mean $p_0$, and a
standard deviation of order $O(1/\sqrt{n})$. For a given n, if $c_n$ is close to $n p_0$, then
we must have

$$E\{P(p) W_1(p)\} = E\{P(p) W_2(p)\},$$

where expectations are taken over this normal distribution. Therefore if
$P(p) W_1(p)$ and $P(p) W_2(p)$ are the reflection of each other in $p_0$, for some small
region close to $p_0$, this asymptote will be approached fairly rapidly.

A rigorous argument can be given for this, involving the method of steepest
descents.


7. THE RISK FUNCTION


I shall define the risk function R(n) of a sampling scheme to be the total
expected loss for a given n,

$$R(n) = n + \sum_{i=1}^{k} \sum_{r=0}^{[c_n]} a_i \, {}^{n}C_{r} \, p_i^{r} q_i^{\,n-r} \, W_1(p_i) + \sum_{i=1}^{k} \sum_{r=[c_n]+1}^{n} a_i \, {}^{n}C_{r} \, p_i^{r} q_i^{\,n-r} \, W_2(p_i), \qquad n = 1, 2, 3, \ldots, \tag{16}$$

where $[c_n]$ is defined as the nearest integer below the neutral line value, or zero,
whichever is the greater. The risk of taking a decision without sampling is

$$R_0 = \min \left( \sum_{i=1}^{k} a_i W_{1i}, \; \sum_{i=1}^{k} a_i W_{2i} \right). \tag{17}$$

From the previous section it follows that $c_n$ is asymptotically linear in n. It
also follows that the probability of making wrong decisions becomes negligibly
small as n increases. Thus for large n, $R(n) \sim n$. Neglecting small irregularities
due to discreteness, the general shape of the risk function appears to be as in
Figure 1.

The rise of R(n) immediately close to n = 0 is due to the fact that in general
the neutral line has no real solution $c_n$ for n below a certain value, say $n'$, at which
$c_n = n'$. For values of n less than this, there is no point for which it pays to
reject, and R(n) therefore increases until the sample size is large enough to
contain at least one rejection point. The following example illustrates this.


Example: Suppose that we have $p_1 = 0.09$, $p_2 = 0.01$, $a_1 = 0.2$, $a_2 = 0.8$,
$W_{11} = 2000$, $W_{22} = 6000$; then $R_0 = \min (400, 4800) = 400$. Consider a sample
size of 1, for which we have either

a good item:   risk of accepting = 0.2 x 2000 x 0.91 = 364
               risk of rejecting = 0.8 x 6000 x 0.99 = 4752
a bad item:    risk of rejecting = 0.8 x 6000 x 0.01 = 48
               risk of accepting = 0.2 x 2000 x 0.09 = 36

There is therefore no rejection point for a sample size of one, and
R(1) = 1 + 364 + 36 = 401. If we consider a sample size of 2, we find that it does
pay to reject when we have 2 defectives, and R(2) = 2 + 396.76 + 0.48 = 399.24.
Thus we have

$R_0 = 400$,   R(1) = 401,   R(2) = 399.24.
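
The arithmetic of this example can be checked with a short sketch of the risk
function. It is an illustration only and makes one simplifying assumption: rather
than locating $[c_n]$ from the neutral line, it chooses at each outcome r whichever
of accepting and rejecting has the smaller expected loss, which gives the same
decisions whenever the neutral line defines them. The function names are
hypothetical. With the sampling cost n counted as in (16), the sketch gives
R(1) = 401 and R(2) = 2 + 396.76 + 0.48 = 399.24.

```python
import math

def risk(n, prior, p, W1, W2):
    """R(n): sampling cost n plus the expected loss of the best decision
    at each possible number of defectives r (assumed decision rule)."""
    total = float(n)
    for r in range(n + 1):
        accept = reject = 0.0
        for a_i, p_i, w1, w2 in zip(prior, p, W1, W2):
            # Joint weight of ordinate i and outcome r.
            pr = a_i * math.comb(n, r) * p_i ** r * (1.0 - p_i) ** (n - r)
            accept += pr * w1   # loss incurred if the batch is accepted
            reject += pr * w2   # loss incurred if the batch is rejected
        total += min(accept, reject)
    return total

def risk_without_sampling(prior, W1, W2):
    """R_0 of equation (17)."""
    return min(sum(a * w for a, w in zip(prior, W1)),
               sum(a * w for a, w in zip(prior, W2)))

prior, p = [0.2, 0.8], [0.09, 0.01]
W1 = [2000.0, 0.0]   # accepting is costly only for the bad ordinate
W2 = [0.0, 6000.0]   # rejecting is costly only for the good ordinate

R0 = risk_without_sampling(prior, W1, W2)
R1 = risk(1, prior, p, W1, W2)
R2 = risk(2, prior, p, W1, W2)
print(R0, R1, round(R2, 2))
```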


In general, excluding the root at the origin, there are at least two roots of
$R(n) = R_0$ (and maybe more owing to the discreteness of the problem). We
could define two roots, $n_1$ such that for $n < n_1$, $R(n) > R_0$, and $n_2$ such that
for $n > n_2$, $R(n) > R_0$, which are the minimum and maximum sample sizes
worth taking. Clearly, $n_1$ is greater than the value of n for which $c_n = n$, and
$n_2 < R_0$. More precise information on the position of these points would be
useful, but this is a difficult mathematical problem. If desired, an iterative
procedure could be set up using binomial tables (or tables of the incomplete
Beta function).

Another interesting feature of the risk function is that because of its shape 
we expect that the rate of increase in the expected risk for sample sizes above 
the optimum will usually be less than one (in the downward direction R(n) 
may increase quite sharply). Thus if we increase our sample size by m units 
above the optimum, we would expect the decrease in efficiency to be less than 
m/R(n), so that, for instance, if R(n) is of the order of 100, an increase of 10 
units above the optimum sample size would be expected to lead to less than 
10 % reduction in efficiency. This point is discussed further in the next section. 


8. ROBUSTNESS


We consider here the situation where we suppose that a mixed binomial
distribution with unknown parameters $a_i$, $p_i$ and $W_{ij}$ accurately describes a process
curve and associated loss functions. Our estimates of these parameters will
contain errors and we may ask what order of discrepancies from the true values
can be tolerated.

This problem is difficult to analyse mathematically, and an extensive pro- 
gramme of calculations is really required. Here we shall limit ourselves to the 
two-ordinate distribution, partly because it is the easiest to deal with, and 
partly because it is likely to be more susceptible to errors in the parameters 
than mixed binomial distributions with a greater number of ordinates. 

Tables 2, 3 and 4 apply to the two-ordinate distribution given in example 3
of section 5, with certain parameters varied one at a time. Table 2 gives the risk
along the neutral line for various values of the sample size and two values of
W. For this example, sample sizes of 10 or 20 either way from the optimum
make very little change in the risk. The risk function increases less rapidly
above the optimum n than below it, and this optimum is remarkably flat.

Tables of Certain Properties of Sampling Schemes for Example 3.
TABLE 2
Risk along the neutral line for two values of $W = W_{11} = W_{22}$


W W
1102 649 1102 649


327.9 205.5 123.1
232.5 153.4 133.8
177.9 129.4 149.1
154.5 123.9

Table 3 has been computed by fixing the values of $a_i$ and $p_i$, and finding the
value of W necessary to give an optimum sample size of a given value. The
ratio of this value of W to some fixed value of W is also given. Therefore
supposing that in our ideal model W was actually 1102, the optimum sample size
should be about 100. If throughout the calculations we had used a (wrong)
value of W, equal to 0.34 x 1102, we should have arrived at an optimum sample
size of 60. A factor of 3 in W seems to make a change of about 50% in the sample
size. For this example, a factor of about 2 would probably be considered to
have a negligible effect on the risk.

Table 4 gives similar calculations for the effect of variation in $a_1$, while the
$p_i$ and W were held constant. A change of about 25% in $a_1$ seems to lead to a
change of about 20 in sample size, and it seems from Table 2 that changes in
n of this order are negligible if the optimum sample size is above about 60, but
that changes in the parameters are more important for the lower sample sizes.

Note to table: Equation (10) gives an optimum solution of n = 100 when
$W = W_{11} = W_{22} = 1102$, and n = 80 when $W = W_{11} = W_{22} = 649$. The actual
optimum for W = 649 is n = 92, and the risk is 120.1.

With reference to the comments made in section 6 about the break-even
quality, we may ask what is the effect of simultaneously reducing $W_{11}$ by, say,
a factor $\alpha$, and increasing $W_{22}$ by the same factor. From equation (11) we see
that this will have roughly 3 times the effect of altering $W = W_{11} = W_{22}$ by a
factor $\alpha$.


TABLE 3
Values of W for equation (10) to have a solution at a given n, $a_i$ and $p_i$ as in Example 3.


R = W/W' R = W/W'
W' = 1102 W' = 649 W W' = 1102 W' = 649 W


149
206
375
649


TABLE 4
Values of the prior probability $a_1$ for equation (10) to have a solution
at a given n, $p_i$ and W as in Example 3.

We may ask how far these conclusions are true generally and one result is
given below which may be useful in this respect. For the two-ordinate
distributions, changes in the $a_i$ or $W_{ij}$ merely add terms independent of n or $c_n$ to
equation (10). Changes in $W = W_{11} = W_{22}$ do not affect the neutral line, and
changes in $a_1$ will only affect it slightly for moderate sample sizes. Clearly, the
change in sample size depends on the inclination of equations (12) and (13)
at the optimum sample size. Now equation (12) is a straight line with slope
$\log (q_2/q_1^{2})$ and from the appendix we have that equation (13) is approximately a
straight line with slope

$$\log \left\{ b^{-b} (1 - b)^{-(1-b)} \right\}, \qquad \text{where} \qquad b = \frac{a}{\log p'} \log \frac{q_1}{q_2}.$$

Therefore the relative slope is by the usual tangent formula

$$\frac{\log \dfrac{q_2}{q_1^{2}} + \log \left\{ b^{b} (1 - b)^{1-b} \right\}}{1 - \log \dfrac{q_2}{q_1^{2}} \, \log \left\{ b^{b} (1 - b)^{1-b} \right\}}, \tag{18}$$

and the greater this quantity, the less will be the effect of changes of $a_i$ and W
on the optimum sample size.

This section only scratches the surface of work which could be done on robust- 
ness of this and allied models, and most of this will probably have to be done 
by an extensive series of calculations. 


9. CONCLUSIONS





A simple method is given for obtaining a single sampling scheme with
minimum risk for the particular model assumed in section 1. It appears from
section 8 that the optimum is flat, and quite large changes in some of the
parameters have little effect on the risk. The most critical parameter is
probably $p_0$, the break-even quality.

I am indebted to Dr. D. R. Cox for many helpful comments during the 
preparation of this paper. 
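APPENDIX

Approximation to $\log {}^{n}C_{c}$. By Stirling's theorem we have

$$n! \simeq \sqrt{2\pi} \, n^{n+\frac{1}{2}} e^{-n} \exp \left\{ \frac{1}{12n} - \frac{1}{360n^{3}} \right\}.$$

Therefore

$$\log_{10} n! = -n \log e + (n + \tfrac{1}{2}) \log n + \tfrac{1}{2} \log 2\pi + \left[ \frac{1}{12n} - \frac{1}{360n^{3}} \right] \log e,$$

$$\log_{10} c! = -c \log e + (c + \tfrac{1}{2}) \log c + \tfrac{1}{2} \log 2\pi + \left[ \frac{1}{12c} - \frac{1}{360c^{3}} \right] \log e,$$

$$\log_{10} (n - c)! = -(n - c) \log e + (n - c + \tfrac{1}{2}) \log (n - c) + \tfrac{1}{2} \log 2\pi + \left[ \frac{1}{12(n - c)} - \frac{1}{360(n - c)^{3}} \right] \log e,$$

and hence

$$\log_{10} {}^{n}C_{c} = -(n - c + \tfrac{1}{2}) \log \left( 1 - \frac{c}{n} \right) - (c + \tfrac{1}{2}) \log \frac{c}{n} - \tfrac{1}{2} \log n - \tfrac{1}{2} \log 2\pi + \left[ \frac{1}{12n} - \frac{1}{12c} - \frac{1}{12(n - c)} - \frac{1}{360n^{3}} + \frac{1}{360c^{3}} + \frac{1}{360(n - c)^{3}} \right] \log e. \tag{19}$$

For a two-ordinate prior distribution this expression reduces further, since
we can write equation (9) as

$$c = a + bn$$

or

$$c/n = b + a/n \qquad \text{and} \qquad \left( 1 - \frac{c}{n} \right) = \left( 1 - b - \frac{a}{n} \right).$$

On substituting in the above expression for $\log {}^{n}C_{c}$, we have

$$\log {}^{n}C_{c} = -[n(1 - b) - a + \tfrac{1}{2}] \log \left( 1 - b - \frac{a}{n} \right) - [nb + a + \tfrac{1}{2}] \log \left( b + \frac{a}{n} \right) + \Delta,$$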

where $\Delta$ is almost constant for small changes in n when n is not close to the
origin. Rewrite this

$$\log {}^{n}C_{c} = [(1 - b)A + bB]n + [(a + \tfrac{1}{2})B + (\tfrac{1}{2} - a)A] + \Delta, \tag{20}$$

where $A = -\log (1 - b - a/n)$ and $B = -\log (b + a/n)$.
Now since A, B, and $\Delta$ change very slowly with n as long as n is not small,
this last equation is approximately a linear function of n.
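
The accuracy of the Stirling form can be checked numerically. The sketch below is
an illustration only; it assumes that equation (19) carries both the 1/12 and the
1/360 correction terms of the three factorial expansions, takes $\Delta$ to be the
terms $-\tfrac{1}{2} \log n - \tfrac{1}{2} \log 2\pi$ plus those corrections, and compares the
result with an evaluation through the log-gamma function. All function names are
hypothetical.

```python
import math

LOG10_E = math.log10(math.e)

def _correction(n, c):
    """Bracketed 1/12 and 1/360 terms of equation (19), before the log e factor."""
    m = n - c
    return (1 / (12 * n) - 1 / (12 * c) - 1 / (12 * m)
            - 1 / (360 * n ** 3) + 1 / (360 * c ** 3) + 1 / (360 * m ** 3))

def log10_binom_stirling(n, c):
    """Equation (19): Stirling approximation to log10 of nC_c."""
    m = n - c
    main = (-(m + 0.5) * math.log10(1.0 - c / n)
            - (c + 0.5) * math.log10(c / n)
            - 0.5 * math.log10(n)
            - 0.5 * math.log10(2.0 * math.pi))
    return main + _correction(n, c) * LOG10_E

def log10_binom_linear(n, a, b):
    """Equation (20) with A, B and Delta evaluated at this n, for c = a + b*n."""
    c = a + b * n
    A = -math.log10(1.0 - b - a / n)
    B = -math.log10(b + a / n)
    delta = (-0.5 * math.log10(n) - 0.5 * math.log10(2.0 * math.pi)
             + _correction(n, c) * LOG10_E)
    return n * ((1 - b) * A + b * B) + ((a + 0.5) * B + (0.5 - a) * A) + delta

def log10_binom_gamma(n, c):
    """Reference value through the log-gamma function (valid for non-integer c)."""
    ln_binom = math.lgamma(n + 1) - math.lgamma(c + 1) - math.lgamma(n - c + 1)
    return ln_binom / math.log(10.0)

# Neutral line of Example 1 at n = 100: c = 0.6034 + 0.03730 * 100.
a, b, n = 0.6034, 0.03730, 100
c = a + b * n
print(log10_binom_stirling(n, c), log10_binom_linear(n, a, b), log10_binom_gamma(n, c))
```

Equation (20) is an exact rearrangement of (19) once A, B and $\Delta$ are evaluated
at the same n; the approximation enters only when they are treated as constants.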
Serial Sampling Acceptance Schemes Derived


From Bayes’s Theorem 


D. R. Cox 
Birkbeck College, University of London 


In many industrial environments it is reasonable to assume that batches of items 
produced sequentially will be positively correlated. Taking advantage of this assump- 
tion it is shown how the sentencing of a batch as good or bad can depend upon the 
number of defectives in a random sample of fixed size drawn from the particular batch 
to be sentenced and upon the number of defectives found in batches before and after. 


1. INTRODUCTION 


The central idea of this paper is due to Professor Barnard, who pointed out 
some years ago the need to consider batch sampling inspection schemes in 
which the decision taken on a particular batch depends not only on the sample 
outcome for that batch, but also on the outcomes for batches close to it in some 
natural ordering, such as order of production. The assumption is that the true 
batch qualities of batches close together will be positively correlated, and that 
a more sensitive scheme can be obtained by exploiting this correlation. 

At least three methods are available by which such serial schemes can be 
constructed : 

(i) a reasonable-looking sentencing rule may be formulated and its properties 
investigated for various sequences of true batch quality. The deferred-sentencing 
schemes discussed recently by Hill, Horsnell and Warner (1959) are examples; 

(ii) the problem of estimating the true batch quality for a particular batch, 
given a sequence of batch outcomes, is a problem of filtering (Wiener, 1949). A 
scheme could be based in particular on a linear filter, i.e. on an appropriate 
linear combination of the batch outcomes (supplemented by a limit on the out- 
come for the batch being sentenced). If this approach is adopted, transformation 
of the observations to obtain approximately constant variance may be theo- 
retically desirable, depending on the relative linearity of the transformed and 
untransformed stochastic processes; 

(iii) we may set up a stochastic process representing the system and apply 
Bayes’s theorem to obtain a sentencing rule. 

This last approach will be followed in the present paper. The model assumed 
is the simplest possible, and I shall not pretend that it is realistic. The calculation 
of the sentencing rule and its properties is entirely straightforward, but becomes 
tedious if the outcomes for several batches are to be taken into account. In this 
paper one numerical example is studied in some detail and some general
conclusions conjectured. An extensive numerical investigation by electronic
computer is, however, desirable before the nature of the schemes, and of their
sensitivity to the underlying assumptions, can be properly understood.


2. SPECIFICATION OF MODEL


Consider a sequence of batches $\{\ldots, B_{n-1}, B_n, B_{n+1}, \ldots\}$ to be sentenced
and suppose that by counting the number of defectives in random samples of
predetermined fixed size, we obtain the sequence $\{\ldots, x_{n-1}, x_n, x_{n+1}, \ldots\}$,
$x_n$ being the number of defectives in the sample from $B_n$. Assume that $x_n$ follows
a Poisson distribution of mean $m_n$, where $m_n$ is the true batch quality, and
that the $\{x_n\}$ are independent. The general argument is unaffected by using the
binomial or hypergeometric distribution instead of the Poisson. To complete
the model we must specify the stochastic process $\{m_n\}$; the following type of
model is mentioned briefly by Barnard (1954).

We assume that $m_n$ has only two possible values, a and b (a < b). Batches
with $m_n = a$ will be called good and those with $m_n = b$ will be called bad; it is
assumed desirable to accept good batches and to reject bad batches. Further
the sequence $\{m_n\}$ will be assumed to be a realization of a simple Markov chain
with transition matrix

$$\begin{pmatrix} 1 - t_a & t_a \\ t_b & 1 - t_b \end{pmatrix}.$$

An alternative way of putting this is that runs of good and bad batches are
independently geometrically distributed with mean run lengths $1/t_a$ and $1/t_b$.
The prior odds that a batch is bad are thus $t_a/t_b$. If and only if $t_a + t_b = 1$, the
sequence of true batch qualities is random.

The prior probability of a particular sequence of m's can now be written down.
Thus if we consider just the batches $B_{n-1}$ and $B_n$, we have the following:

$$\begin{aligned}
\mathrm{prob}\,(m_{n-1} = m_n = a) &= \frac{t_b (1 - t_a)}{t_a + t_b}, \\
\mathrm{prob}\,(m_{n-1} = m_n = b) &= \frac{t_a (1 - t_b)}{t_a + t_b}, \\
\mathrm{prob}\,(m_{n-1} = a, m_n = b) = \mathrm{prob}\,(m_{n-1} = b, m_n = a) &= \frac{t_a t_b}{t_a + t_b}.
\end{aligned} \tag{1}$$

For example, the probability that both batches are good is the equilibrium
probability that $B_{n-1}$ is good, namely $t_b/(t_a + t_b)$, times the conditional
probability that $B_n$ is good given that $B_{n-1}$ is good, and this is $1 - t_a$. Notice that the
model is reversible in time. An analogous model for process control has been
treated by Girshick and Rubin (1952). Their model is simpler to deal with in 
that once a transition to the bad state has occurred, the system returns to the 
good state only when the first transition has been detected, and the process 
corrected. While a system of process control is no doubt to be preferred to pure 
acceptance sampling, only the latter will be considered in the present paper. 
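
The prior probabilities (1) for a pair of successive batches can be tabulated
with a few lines of code. This is a minimal sketch; the function name
`pair_priors` and the numerical values of the transition probabilities are
hypothetical, chosen only for illustration.

```python
def pair_priors(t_a, t_b):
    """Equilibrium probabilities of (m_{n-1}, m_n) for the two-state chain.

    t_a : probability of a good batch being followed by a bad one
    t_b : probability of a bad batch being followed by a good one
    """
    s = t_a + t_b
    return {
        ("good", "good"): t_b * (1.0 - t_a) / s,
        ("bad", "bad"): t_a * (1.0 - t_b) / s,
        ("good", "bad"): t_a * t_b / s,
        ("bad", "good"): t_a * t_b / s,
    }

# Hypothetical values: long runs of good batches, shorter runs of bad ones.
pr = pair_priors(0.1, 0.2)
prob_bad = pr[("bad", "bad")] + pr[("good", "bad")]     # marginal for B_n
prob_good = pr[("good", "good")] + pr[("bad", "good")]
print(pr, prob_bad / prob_good)
```

The four probabilities sum to one, and the marginal odds that $B_n$ is bad reduce
to $t_a/t_b$, in agreement with the prior odds stated in the text.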


3. ONE-STEP BACK SENTENCING RULE


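Suppose that we propose to sentence the batch $B_n$ on the basis of $x_{n-1}$ and $x_n$.
The prior probabilities of the various possible values for $m_{n-1}$ and $m_n$ are given
by (1). The likelihoods can be written down from the equation of the Poisson
distribution; for example

$$\mathrm{prob}(x_{n-1}, x_n \mid m_{n-1} = a,\ m_n = b) = \frac{e^{-a-b}\, a^{x_{n-1}}\, b^{x_n}}{x_{n-1}!\; x_n!}.$$

Thus from Bayes’s theorem the posterior probabilities of the various possible
values for $m_{n-1}$ and $m_n$ can be found. In particular the final odds that $B_n$ is bad,
that is the ratio of $\mathrm{prob}(m_n = b \mid x_{n-1}, x_n)$ to $\mathrm{prob}(m_n = a \mid x_{n-1}, x_n)$, are given by

$$\frac{\mathrm{prob}(m_n = b \mid x_{n-1}, x_n)}{\mathrm{prob}(m_n = a \mid x_{n-1}, x_n)}
= \frac{t_a(1 - t_b)\, e^{-2b}\, b^{x_{n-1} + x_n} + t_a t_b\, e^{-a-b}\, a^{x_{n-1}} b^{x_n}}
       {t_b(1 - t_a)\, e^{-2a}\, a^{x_{n-1} + x_n} + t_a t_b\, e^{-a-b}\, b^{x_{n-1}} a^{x_n}}. \tag{2}$$

Now if $w_a$ is the loss from rejecting a good batch and $w_b$ the loss from accepting
a bad batch, it is well-known that the optimum sentencing rule of the present
type is to reject when

$$\frac{\mathrm{prob}(m_n = b \mid x_{n-1}, x_n)}{\mathrm{prob}(m_n = a \mid x_{n-1}, x_n)} > \frac{w_a}{w_b},$$

to accept when

$$\frac{\mathrm{prob}(m_n = b \mid x_{n-1}, x_n)}{\mathrm{prob}(m_n = a \mid x_{n-1}, x_n)} < \frac{w_a}{w_b},$$

it being immaterial what is done if the final odds are equal to $w_a/w_b$.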
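The calculation behind (2) and the sentencing rule can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `posterior_odds_bad` and `sentence` are hypothetical, Poisson sampling with means $a$ and $b$ is assumed as in section 2, and the numerical values are those of the example of section 6.

```python
from math import exp


def posterior_odds_bad(x_prev, x_cur, a, b, t_a, t_b):
    """Final odds that B_n is bad given (x_{n-1}, x_n), following (2).

    The factor 1/(t_a + t_b) and the factorials are common to every term
    and cancel in the ratio, so they are omitted.
    """
    num = (t_a * (1 - t_b) * exp(-2 * b) * b ** (x_prev + x_cur)      # m_{n-1} = b, m_n = b
           + t_a * t_b * exp(-a - b) * a ** x_prev * b ** x_cur)      # m_{n-1} = a, m_n = b
    den = (t_b * (1 - t_a) * exp(-2 * a) * a ** (x_prev + x_cur)      # m_{n-1} = a, m_n = a
           + t_a * t_b * exp(-a - b) * b ** x_prev * a ** x_cur)      # m_{n-1} = b, m_n = a
    return num / den


def sentence(x_prev, x_cur, a, b, t_a, t_b, w_a, w_b):
    """Reject when the final odds exceed w_a / w_b, otherwise accept."""
    odds = posterior_odds_bad(x_prev, x_cur, a, b, t_a, t_b)
    return "reject" if odds > w_a / w_b else "accept"


params = dict(a=0.2, b=2.0, t_a=0.02, t_b=0.2)   # example of section 6
for point in [(1, 1), (2, 1)]:
    print(point, round(posterior_odds_bad(*point, **params), 3),
          sentence(*point, w_a=100.0, w_b=100.0, **params))
```

With these values the point $(x_{n-1}, x_n) = (2, 1)$ has final odds of about 1.7 and is rejected, whereas $(1, 1)$ has odds of about 0.25 and is accepted.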


The determination of the rejection region is much simplified by the property
that if $(x_{n-1}, x_n)$ is a rejection point, so also is any point $(x'_{n-1}, x'_n)$ for which
$x'_{n-1} \ge x_{n-1}$ and $x'_n \ge x_n$. There is a complementary property for the acceptance
region.

When the rejection points have been found, the probabilities of rejection can
be determined for the type of sequence $\{m_n\}$ assumed in deriving the rule, and
also for other types of sequence $\{m_n\}$. In particular, the expected loss per batch
from wrong decisions is, for a two-point prior distribution,

$$w_b\, \mathrm{prob}(m_n = b)\, \mathrm{prob}(\text{acceptance} \mid m_n = b) + w_a\, \mathrm{prob}(m_n = a)\, \mathrm{prob}(\text{rejection} \mid m_n = a).$$
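For the single-batch rule the expected loss can be evaluated directly. The sketch below is an illustration under stated assumptions: it weights each conditional error probability by the equilibrium probability of the corresponding batch quality, uses the parameter values of section 6, and introduces the helper name `poisson_cdf` for convenience.

```python
from math import exp, factorial


def poisson_cdf(k, mean):
    """P(X <= k) for a Poisson variable with the given mean (0 when k < 0)."""
    return sum(exp(-mean) * mean ** j / factorial(j) for j in range(k + 1))


a, b = 0.2, 2.0
t_a, t_b = 0.02, 0.2
w_a = w_b = 100.0

p_good = t_b / (t_a + t_b)        # equilibrium probability of a good batch
p_bad = 1.0 - p_good

# Single-batch rule: smallest x_n whose posterior odds of "bad" exceed w_a / w_b.
c = 0
while (p_bad / p_good) * exp(a - b) * (b / a) ** c <= w_a / w_b:
    c += 1

p_reject_good = 1.0 - poisson_cdf(c - 1, a)
p_accept_bad = poisson_cdf(c - 1, b)
expected_loss = w_a * p_good * p_reject_good + w_b * p_bad * p_accept_bad
print(c, round(p_reject_good, 4), round(p_accept_bad, 4), round(expected_loss, 2))
```

Under these assumptions the rejection number is 2 and the expected loss is about 5.28, the single-batch figures quoted in Table 2.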


4. MORE COMPLEX BACKWARD-LOOKING SENTENCING RULES


In principle the argument of section 3 is immediately generalised to derive
a sentencing rule for $B_n$ based on $(x_{n-k}, x_{n-k+1}, \ldots, x_n)$. The final odds that
$B_n$ is bad are given by the ratio of two expressions each with $2^k$ terms. Formula
(2) is the special case $k = 1$.

For desk calculation this process rapidly gets tedious, although in special 
cases it is possible to omit many terms, for example those that correspond to 
several low-probability transitions in the Markov chain. 

Some general properties of the schemes can be stated. First there is a critical
number $r$ such that any batch with $r$ or more defectives is rejected whatever the
outcomes for previous batches. To see this, note that because the process $\{m_n\}$
is Markovian, we reject a batch $B_n$ for all $(x_{n-1}, x_{n-2}, \ldots)$ if and only if we
would reject it if we knew that $B_{n-1}$ is good, i.e. that $m_{n-1} = a$. Now,
conditionally on $m_{n-1} = a$, the final odds that $B_n$ is bad are

$$\frac{t_a\, e^{-b}\, b^{x_n}}{(1 - t_a)\, e^{-a}\, a^{x_n}}.$$

Thus $r$ is the smallest integer greater than or equal to

$$\log\left[\frac{(1 - t_a)\, w_a\, e^{b-a}}{t_a\, w_b}\right] \Big/ \log\left(\frac{b}{a}\right).$$

In a scheme with a small value of $k$, the unconditional rejection number may
be less than $r$.
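The critical number $r$ is easily computed. The sketch below is illustrative only: `critical_number` and `odds_given_previous_good` are hypothetical names, the conditional odds are taken as $t_a e^{-b} b^{x_n} / \{(1 - t_a) e^{-a} a^{x_n}\}$, and the parameter values are those of the example of section 6.

```python
from math import ceil, exp, log


def odds_given_previous_good(x_n, a, b, t_a):
    """Final odds that B_n is bad when B_{n-1} is known to be good."""
    return t_a * exp(-b) * b ** x_n / ((1 - t_a) * exp(-a) * a ** x_n)


def critical_number(a, b, t_a, w_a, w_b):
    """Smallest integer r at or above the bound obtained by setting the odds equal to w_a / w_b."""
    bound = (log((1 - t_a) * w_a / (t_a * w_b)) + (b - a)) / log(b / a)
    return ceil(bound)


a, b, t_a = 0.2, 2.0, 0.02
r = critical_number(a, b, t_a, w_a=100.0, w_b=100.0)
print(r, odds_given_previous_good(r - 1, a, b, t_a), odds_given_previous_good(r, a, b, t_a))
```

For these values the bound is about 2.47, so that $r = 3$: the conditional odds are below unity for $x_n = 2$ and above it for $x_n = 3$.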

The second property is that the rule may tell us to reject $B_n$ even if $x_n = 0$.
The condition that this should never happen is that we accept a batch with
$x_n = 0$ even if we know that $B_{n-1}$ is bad, i.e. that $m_{n-1} = b$. That is we must have

$$\frac{(1 - t_b)\, e^{-b}}{t_b\, e^{-a}} < \frac{w_a}{w_b}. \tag{3}$$

If (3) is not satisfied and we regard it as wrong or impolitic to reject a batch 
with no defectives, we must either make an ad hoc modification of the scheme 
or insist on a second sample for such batches. In the latter case the posterior 
odds that the batch is bad are recalculated in an obvious way and the batch 
appropriately sentenced. 
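As a worked check of condition (3), take the values used in the numerical example of section 6, namely $a = 0.2$, $b = 2$, $t_b = 0.2$ and $w_a = w_b = 100$. The arithmetic below is an added illustration, not part of the original text.

```latex
\frac{(1 - t_b)\, e^{-b}}{t_b\, e^{-a}}
  = \frac{0.8}{0.2}\, e^{-1.8}
  \approx 4 \times 0.1653
  \approx 0.66
  < 1 = \frac{w_a}{w_b}
```

Condition (3) is therefore satisfied for these values, so a backward-looking scheme does not reject a batch with no defectives.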

A third result that can be obtained from the Markovian character of the
process $\{m_n\}$ is a lower bound to the expected loss for a scheme with very large
$k$. For the expected loss must exceed that for the Bayesian scheme in which
$m_{n-1}$ is assumed known.


5. BACKWARD AND FORWARD LOOKING SENTENCING RULES


The arguments used in the last two sections can be applied to deferred-sentencing
schemes in which the $x$'s for batches after $B_n$ can be used in sentencing
$B_n$. By the reversibility of the model of section 2, the scheme based on
$(x_n, x_{n+1}, \ldots, x_{n+k})$ has identical form and properties to that based on
$(x_n, x_{n-1}, \ldots, x_{n-k})$. The simplest scheme different from those of preceding
sections is the one-step-forward one-step-back scheme based on $(x_{n-1}, x_n, x_{n+1})$.
This will have lower expected loss than a two-step-back scheme.

The properties given at the end of section 4 can all be generalised. 
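For short windows the final odds can be obtained by direct enumeration of the possible sequences of batch qualities, whether the window looks backward, forward or both ways. The following Python sketch is a brute-force illustration under the model of section 2, not an efficient or authoritative implementation; the name `window_posterior_odds` and its arguments are hypothetical, and the parameter values are those of section 6.

```python
from itertools import product
from math import exp


def window_posterior_odds(xs, target, a, b, t_a, t_b):
    """Final odds that the batch at position `target` of the window is bad.

    `xs` holds the observed numbers of defectives for consecutive batches.
    Every sequence of states ('a' = good, 'b' = bad) is enumerated; its prior
    probability comes from the Markov chain started in equilibrium and its
    likelihood from the Poisson distribution (factorials cancel in the ratio).
    """
    means = {"a": a, "b": b}
    start = {"a": t_b / (t_a + t_b), "b": t_a / (t_a + t_b)}
    step = {("a", "a"): 1 - t_a, ("a", "b"): t_a,
            ("b", "a"): t_b, ("b", "b"): 1 - t_b}
    weight = {"a": 0.0, "b": 0.0}
    for states in product("ab", repeat=len(xs)):
        p = start[states[0]]
        for prev, cur in zip(states, states[1:]):
            p *= step[(prev, cur)]
        for m, x in zip(states, xs):
            p *= exp(-means[m]) * means[m] ** x
        weight[states[target]] += p
    return weight["b"] / weight["a"]


params = dict(a=0.2, b=2.0, t_a=0.02, t_b=0.2)
# One-step-forward, one-step-back: the window is (x_{n-1}, x_n, x_{n+1}), target index 1.
print(window_posterior_odds((2, 0, 2), 1, **params))
print(window_posterior_odds((0, 2, 0), 1, **params))
```

With these values the window $(x_{n-1}, x_n, x_{n+1}) = (2, 0, 2)$ gives odds of about 1.7, so that a batch with no defectives can be rejected on the evidence of its neighbours, whereas $(0, 2, 0)$ gives odds well below unity.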


6. AN EXAMPLE 


For our numerical example we take $a = 0.2$, $b = 2$, $t_a = 0.02$, $t_b = 0.2$. Thus
the prior odds that a batch is good are 10:1, the mean length of runs of good
batches is 50, and of bad batches 5. Further we take $w_a = w_b = 100$ so that the
sentencing rule is based on the value of $m_n$ of greater posterior probability.

Table 1 gives the rejection points for several simple schemes. Table 2 gives 
the properties of the schemes, in particular their expected losses, assuming that 
the process $\{m_n\}$ has the form postulated in deriving the schemes.

In this example the single batch scheme has low consumer protection, and the 
advantages of the serial schemes are shown in improving this, i.e. in the increased 
chance of rejecting bad batches. 

While one would try to determine values of $a$, $b$, $t_a$ and $t_b$ by analysing suitable
data, and from general experience of the process, it is clearly important to
examine what happens if the true batch quality does not have the statistical
properties postulated above. This is done rather sketchily in Table 3. The
first part of the table shows the probability of rejection as a function of batch
quality $m$ (i.e. the power function) when all neighbouring batches have the same
quality $m$. This gives also an idea of the behaviour of the scheme when the true
batch quality drifts slowly, instead of jumping discontinuously from $a$ to $b$ and
back. The second part of the table shows properties of the scheme when, although
the odds that a batch is good are 10 to 1, the transition probabilities are not
those assumed in deriving the scheme. Three other situations are considered in
which $1/t_b$, the mean length of a run of bad batches, is $\infty$, 2 and 11/10, the last
corresponding to a random sequence of good and bad batches. Of course many
other forms of input could be examined.


TABLE 1
Rejection regions for some simple schemes

Type of scheme                   Batch $B_n$ is rejected if

Single batch                     $x_n \ge 2$.

One-step-back                    $x_n \ge 2$; $x_n = 1$, $x_{n-1} \ge 2$.

Two-steps-back                   $x_n \ge 3$; $x_n = 2$, $x_{n-1} \ge 1$; $x_n = 2$, $x_{n-1} = 0$, $x_{n-2} \ge 2$;
                                 $x_n = 1$, $x_{n-1} \ge 3$; $x_n = 1$, $x_{n-1} = 2$, $x_{n-2} \ge 1$;
                                 $x_n = 1$, $x_{n-1} = 1$, $x_{n-2} \ge 2$.

One-step-forward,                $x_n \ge 3$; $x_n = 2$, $x_{n-1} + x_{n+1} \ge 2$; $x_n = 1$, $x_{n-1} + x_{n+1} \ge 3$;
  one-step-back                  $x_n = 0$, $x_{n-1}, x_{n+1} \ge 2$.

$m_{n-1}$ given                  $x_n \ge 1$, $m_{n-1} = b$; $x_n \ge 3$, $m_{n-1} = a$.

$m_{n-1}$ and $m_{n+1}$ given    $x_n \ge 4$, $m_{n-1} = m_{n+1} = a$; $x_n \ge 1$, $m_{n-1} \ne m_{n+1}$;
                                 $x_n \ge 0$, $m_{n-1} = m_{n+1} = b$.


TABLE 2
Properties of schemes under assumed input characteristics

                                 Single   One-step-  Two-steps-  One-step-forward,  $m_{n-1}$  $m_{n-1}$ and
                                 batch    back       back        one-step-back      given      $m_{n+1}$ given

Prob of rejecting good batch     .0175    .0223      .0071       .0070              .0048      .0112
Prob of accepting bad batch      .4060    .2764      .3934       .2662              .2436      .0776
Expected loss                    5.28     4.54       4.22        3.05               2.65       1.72
Prob of rejecting good batch
  in a run of good batches       .0175    .0204      .0053       .0037              .0012      .0000
Prob of accepting bad batch
  in a run of bad batches        .4060    .2452      […]         .1768              .1353      .0000


TABLE 3
Properties of schemes for various input characteristics

a) Probability of rejection at constant quality level

True batch mean    Single batch    One-step-back    One-step-forward,
                                                    one-step-back
0.2                .0175           .0204            .0025
0.4                .0616           .0780            .0232
0.8                .1912           .2599            .1605
1.2                .3374           .4593            .3851
1.6                .4751           .6285            .6767
2.0                .5940           .7548            .8232
2.4                .6916           .8421            .9092
2.8                .7689           .8998            .9546
3.2                .8288           .9369            .9785
3.6                .8743           .9603            .9900
4.0                .9084           .9750            .9954
4.4                .9337           .9841            .9979

b) Effect of change in transition probabilities with fixed 10:1 ratio of good
to bad batches

$t_a$                            $t_a \ll 1$   0.02              0.05     1/11
$t_b$                            $10 t_a$      0.2               0.5      10/11
$1/t_b$                          $\infty$      5                 2        11/10
                                               (assumed model)            (random)

(i) Probability of rejecting good batch
Single batch                     .0175         .0175             .0175    .0175
One-step-back                    .0204         .0223             .0251    .0290
One-step-forward,
  one-step-back                  .0037         .0070             .0122    .0202

(ii) Probability of accepting bad batch
Single batch                     .4060         .4060             .4060    .4060
One-step-back                    .2452         .2764             […]      .3483
One-step-forward,
  one-step-back                  .1768         .2662             […]      .6106

(iii) Expected loss
Single batch                     5.28          5.28              5.28     5.28
One-step-back                    […]           4.54              […]      5.80
One-step-forward,
  one-step-back                  […]           3.05              […]      7.39


7. DISCUSSION


It would be wrong to read too much into one numerical example, but the
following general remarks suggest themselves.

(i) The discreteness of the problem makes detailed interpretation of the
results, particularly of the probabilities of wrong decisions, difficult. Thus the
effect of taking account of $x_{n-1}$ is to increase the probability of rejecting a good
batch and to decrease the probability of accepting a bad batch. The further
introduction of $x_{n-2}$ decreases the first probability and increases the second. Of
course each introduction of new information decreases the mean loss, and this
quantity does behave in a fairly smooth way.
(ii) If we consider backward-looking schemes, in which to sentence $B_n$ the
values $(x_{n-k}, \ldots, x_n)$ are used, we have expected losses as follows: $k = 0$, 5.28;
$k = 1$, 4.54; $k = 2$, 4.22, while as $k \to \infty$, the expected loss must exceed 2.65.
Now the first three values are fitted by

expected loss $= 3.98 + 1.30\,(0.43)^k$.

Wild extrapolation of this suggests that 3.98 is the limiting expected loss when
all previous observations are included, and that we would be within 5% of the
limiting expected loss with $k = 3$. In general, provided that most batches are
good, it is plausible that little will be gained by taking $k$ greater than about
$1/t_b$, the expected run length of bad batches.

(iii) There is on all counts a substantial gain from including information 
from batches on both sides of the one being sentenced. The expected loss for the 
one-step-forward, one-step-back scheme, namely 3.05, is substantially less than 
that guessed in (ii) to be the limiting loss when all previous batches are taken 
into account. This conclusion, if confirmed, seems important. It is plausible 
that two-sided schemes are particularly desirable when one, but not both, of 
$t_a$ and $t_b$ is very small.

(iv) The results in Table 3a show what happens when the true batch quality 
is locally constant, but is not necessarily equal to a or b. The main conclusion 
is that the power function for the two-sided scheme is much steeper than for 
the two one-sided schemes. 

(v) Table 3b shows, as a function of $1/t_b$, the mean length of a run of bad
batches, what happens when the matrix of transition probabilities is not that
assumed in deriving the schemes. All the schemes do, however, have $t_b/t_a = 10$,
so that the prior odds that a batch is good are 10 to 1. The most important point
here is the behaviour when good and bad batches occur randomly. We are then,
of course, worse off using a serial scheme than in using a non-serial scheme; the
additional loss is about equal to the gain obtained with the serial scheme under the
conditions assumed in deriving the serial scheme. Note, however, that when
$t_b = 1/2$, so that the mean length of a run of bad batches is much less than that
assumed in obtaining the schemes, it is still better to use the serial schemes
rather than the single-batch rule.
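The fitted formula quoted in remark (ii) can be recovered from the three expected losses by passing a curve of the form limit + amplitude × ratio$^k$ exactly through $k = 0, 1, 2$. The sketch below is an added illustration; the function name `fit_geometric` is hypothetical.

```python
def fit_geometric(l0, l1, l2):
    """Fit loss(k) = limit + amp * rho**k exactly through k = 0, 1, 2."""
    rho = (l1 - l2) / (l0 - l1)      # ratio of successive decrements
    amp = (l0 - l1) / (1 - rho)      # size of the decaying term at k = 0
    limit = l0 - amp                 # value approached as k increases
    return limit, amp, rho


limit, amp, rho = fit_geometric(5.28, 4.54, 4.22)
loss_k3 = limit + amp * rho ** 3
print(round(limit, 2), round(amp, 2), round(rho, 2), round(loss_k3 / limit, 3))
```

The constants come out close to 3.98, 1.30 and 0.43, and the extrapolated loss at $k = 3$ is within about 3% of the fitted limit, consistent with the statement in remark (ii).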


8. SOME GENERALIZATIONS


One modification of the above scheme that may often be necessary arises when 
there is external information to suggest that a change in true batch quality has 
occurred. For example there may have been a run of bad batches and a change 
in the process may be made after the production of $B_{n-1}$ that is confidently
expected to improve quality. It would then be wrong to use $x$'s before $x_n$ in
sentencing $B_n$. The theoretically correct procedure is to calculate the posterior
odds that $B_n$ is bad using only $x_n$ and such subsequent $x$'s that can be made
available when sentencing occurs.

It has been assumed above that the sample size is fixed and predetermined. 
If the sample size is varied in any way, the calculation of the sentencing rule 
is in principle straightforward, provided that the previous assumptions about 
true batch quality remain valid. Thus, if for a particular batch the sample size 
is doubled, the Poisson means for good and bad batches are 2a and 2b, and the
probabilities of good and bad batches are exactly as before. Of course the details 
of the calculation get very messy if there are many possibilities to be considered. 

A more significant problem is the construction of a suitable multi-stage pro- 
cedure. With a two-point prior distribution the optimum sequential scheme 
will be related to a likelihood-ratio type with fixed limits. If we are allowed to 
take occasional second samples, but otherwise must stick to a single sample 
plan, the second samples should be taken on those batches for which the final 
odds of being bad are near to the critical limit. That applies when there is a two- 
point distribution of true batch quality. If there is much material of inter-
mediate quality, such a double-sample scheme may lead to much attention being 
paid to batches for which the final decision is immaterial. 

A final remark is that we have here characterized each scheme by a rejection 
rule, by an expected loss and by various probabilities of incorrect decisions. 
These things all vary discontinuously as the parameters of the system change. 
An alternative type of characterization is by the expected amount of information 
(in Shannon’s sense) about the true batch mean provided by the observations. 
This may be useful in a situation where it is not required to formulate a me- 
chanical rejection rule. However, any such quantity needs to be interpreted 
cautiously, because the behaviour of the schemes is too complex to be condensed 
into a single number. 
TECHNOMETRICS AUGUST, 1960


Discussion of the Papers of Messrs. Hald, Wetherill and Cox


Editor’s Note: The three preceding papers in this issue of Technometrics, “The
Compound Hypergeometric Distribution and a System of Single Sampling Inspection
Plans Based on Prior Distributions and Costs” by A. Hald, “Some Remarks on the
Bayesian Solution of the Single Sampling Inspection Scheme” by G. B. Wetherill, and
“Serial Sampling Inspection Schemes Derived from Bayes Theorem” by D. R. Cox,
were presented at a meeting in London in February 1960. After the presentation of
the papers the following discussion ensued.


G. A. BARNARD 


I will just make a few brief remarks about Professor Hald’s paper. The 
asymptotic result that Professor Hald has, which is confirmed by Guthrie and 
Johns, about the relationship between the lot size and sample size, is a most 
important and useful contribution to the practice as well as the theory of this 
subject. One often has the problem of interpolating in a sampling system, given, 
say, a table showing batch size 200, sample size 20, accept no defectives; batch 
size 1000, sample size 50, accept up to two defectives, and one is then required 
to say what should be done if the batch size is 500 or if it is 1500. From the 
result given it follows that one should interpolate by a formula involving some- 
thing between the logarithm and the square root. It does not finally resolve the 
question and I would like to suggest that perhaps the methods might be extended 
if possible to deal with the case of sequential plans. It seems to me here that one 
would have to consider not the average sample size, which rather comes out as
a by-product in these schemes, but the actual producer’s and consumer’s risks. 
One would conjecture, that the consumer’s risk, at any rate, should go down 
inversely as the lot size. If the permitted number of defectives are near to zero 
this is going to be something like the correct relationship. But it would certainly 
be valuable to have some more precise notions on that point. 

The other main point I wanted to make is to draw attention to the sentence 
in Section 4 where it is said that systems of inspection such as Military Standard 
105A should be regarded simply as modes of indexing sets of OC curves. This 
seems to me a most valuable remark. It resolves a problem which has bothered 
me for a long time: how to describe in a short way what we mean when we talk
about a system of sampling inspection such as that described in Military Standard 
105A. I now see that it is best regarded as a mode of indexing, where you start 
by saying that you have a certain lot size, you have a certain average quality 
level, and you want a certain level of inspection. Having determined these 
things you are led through the tables provided to adopt a specific sampling 
plan. Now you might have arrived at exactly the same plan from Professor 
Hald’s point of view if instead of starting with average quality level and so on, 
you start with the process curve and the cost factors which he takes into account. 
These might lead you to the identical sample size and the identical acceptance
number, but by a different route. Professor Hald’s approach, and Military 
Standard 105A could therefore be compared to an ordinary dictionary in which 
you look up words by using the front half of the word, indexed alphabetically 
from the front, and a rhyming dictionary in which you look words up starting 
at the back. If you are in fact composing a poem—and I suppose there is a good 
deal of analogy between designing a sampling plan and writing a poem—the 
kind of dictionary you use will to some extent depend on the relative importance 
you give to the meaning of what you are writing, and the metre, and the rhyme. 
But one could imagine a dictionary which indexed words by, say, the third 
syllable in the word, and this would not be much use because it would omit all 
words which had only two syllables. And in this connection I want to make a 
point. I feel that the standard methods which have been in use up to now of 
indexing sampling schemes have been faulty in using terms like the operating 
characteristic curve, and so on, to define sampling schemes, because they rule 
out consideration of the schemes which Dr. Cox is going to talk about after tea. 
In these the probability that a batch will be accepted is not uniquely defined by 
the quality of the batch itself, but it is partly dependent on what has happened
to preceding batches. In such cases one cannot talk about the OC curve in the 
ordinary sense at all, one has to introduce the concept of average run length, 
the number of batches which are accepted before a batch is rejected, or some- 
thing like that. From this point of view I think the sampling dictionaries in 
common use could well be improved. But I think this analogy of the dictionary 
does enable one to obtain some sort of provisional balance between the two 
approaches to sampling inspection which now hold the field, the one approach 
via the Bayesian system and evaluation of costs, and the other approach which 
says you specify consumers’ risk or producers’ risk, or an average run length, 
or some kind of operating feature of your plan and then find a plan to fit that. 
The two approaches I think can be compared with the ordinary dictionary and 
the rhyming dictionary. It is true that in the last analysis no person taking
himself seriously would do other than use an ordinary dictionary, because after
all the sense of what is written must always be more important than the rhyme.
So to this extent I think the Bayesian system is ultimately the one to be preferred. 
But in the heat of the moment and the needs of the urgent present, one might 
turn in practice to the rhyming dictionary and use things like Military Standard 
105A, knowing that it is not perfect. 


D. V. LINDLEY


I want to make just one technical remark about a point raised by Professor 
Hald in his paper. There is a theorem due to Bruno de Finetti that seems relevant 
in this connection. De Finetti’s result is as follows: If we have a stochastic 
process $x_1, x_2, \ldots$, the values of $x$ being 0’s or 1’s, as here where we are observing
defectives or non-defectives, and if the distribution of this stochastic process is
such that it is invariant under changes in the numbering of the process, that is
the joint distribution of any three of the $x$’s, for example, is just the same as that of
$x_1$, $x_2$ and $x_3$, then De Finetti shows that the stochastic process must have a
mixed binomial distribution. It is quite a deep theorem. The most accessible 
reference is L. J. Savage’s book, Foundations of Statistics, (p. 53). The theorem 
is used by De Finetti as the basis of his derivation of frequency probabilities. 
He thinks that probability is fundamentally a degree of belief notion and that 
frequency notions are derived from it. He derives frequency probabilities by this 
theorem. This seems to me to be relevant to Professor Hald’s theorem to the 
effect that the only reproducible distributions are those derived from mixed 
binomials. I cannot quite see for myself all the details but it does seem that his 
requirement concerning taking the hypergeometric sample amounts to saying 
that whatever you take from it the distribution has got to be the same (it must
be $g_n(x)$), and so he is imposing this invariance condition which is very much
the same as De Finetti’s. If so, this result about reproducibility should follow.
De Finetti’s result has been generalised very significantly by Savage and Hewitt, 
where they have discussed not only processes of 0, 1, type but general processes, 
and this might be of interest if one were dealing with inspection schemes where 
a measurement was made rather than a classification into effectives and defectives.

The final remark I should like to make is that I was a little surprised in a sense 
by Professor Hald’s apology for using prior distributions; and again by Professor 
Barnard’s remarks. I would have thought that Savage had just given in his book 
the reason for doing so. It pleased me greatly, of course, to see the honest use 
of these prior distributions being made today. 


G. A. BARNARD 


Since Mr. Lindley has been speaking about the prior distribution in sampling 
inspection I feel obliged to indicate what I had in mind. It derives from an 
extension of the approach that Professor Hald made, using the prior distribution, 
and in fact one can think of it if you take the prior distribution really as the 
fundamental thing, or, rather, the ‘quality distribution’, which is the prior 
distribution before you have sampled, and which is successively changed, in 
accordance with Bayes’s theorem, as inspection proceeds. It is as if, when
you look at the batch, you do not see the batch; you see the quality distribution
to which it now belongs. When the batch is delivered to you, it comes trailing 
the clouds of the prior quality distributions which are associated with the 
manufacturer of that batch. After you have taken a sample from the batch, the 
prior quality distribution which it started with is changed by Bayes’s theorem, 
multiplied by the likelihood function, into another quality distribution, and 
after you have taken the sample it is in the light of the new quality distribution 
that you decide what to do with it. Now the flexible scheme which I had in mind 
to put forward would take account of the fact that in real life the cost functions 
which are taken as constant in Professor Hald’s paper are not in fact constant 
over time, and that there are situations that arise in industrial practice where 
there is a very urgent demand for the goods being inspected. Other situations 
arise where the demand for these goods is less urgent, cases where the available 
storage capacity is already practically full, cases where the storage capacity is 
practically empty. And if one takes into account these fluctuations of circum-
stances one should allow the quantities involved in the scheme to fluctuate 
correspondingly. 

In a very simple case where the prior quality distribution was in the form of
the two point mixed binomial centered at $p_0$ and $p_1$, good and bad, the quality
distributions after sampling could be represented on a line on which one plots
the log odds for good versus bad. If your prior probabilities
are $a_0$ and $a_1$, the initial odds will be $a_0/a_1$, and then if a sample gives $r$
defectives and $s$ non-defectives the posterior quality odds will be

$$\frac{a_0}{a_1} \left(\frac{p_0}{p_1}\right)^{r} \left(\frac{q_0}{q_1}\right)^{s},$$

as in Mr. Wetherill’s paper. Having inspected the batch, you can put up a
little flag saying batch 101, and stand it on the point corresponding to the correct
log-odds:

$$r \log (p_0/p_1) + s \log (q_0/q_1) + \log (a_0/a_1).$$


The batch is then put into store, labelled accordingly. Now suppose another 
batch is delivered. This will after a preliminary sample, be attached to another 
flag, and so after a period of operating this scheme you have flags strung along 
the line of log-odds. Now at any given stage your inspectors may perhaps be 
idle, and in that case it would seem sensible that they should be asked to get 
out of the store the batch which is near the middle of the line and told to take a 
further sample from that batch with a view to moving the flag either to the 
right or to the left, to get further information about it, to be more certain that 
it is good, or more certain that it is bad. On the other hand, if the action called 
for is not further inspection but decision to deliver batches into the factory, 
then if the factory orders two batches, which must be delivered tomorrow, the 
two batches you take are just those two which you think most likely to be good. 
If on the other hand the van is present and it is able to take back to the supplier 
some bad batches, if it has space for one, then you take just the worst batch; 
or you may send the worst two batches, according to circumstances. And so 
you use the picture of the posterior quality distributions to guide you on what 
to do in current circumstances. 
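The bookkeeping described above can be sketched as follows. This is an added illustration under stated assumptions: $q_i = 1 - p_i$ is assumed, "the middle of the line" is read as log-odds zero, and the function name `log_odds_good`, the quality levels, the prior probabilities and the batch records are all hypothetical values chosen for the example.

```python
from math import log


def log_odds_good(r, s, p0, p1, a0, a1):
    """Posterior log-odds (good versus bad) after r defectives and s non-defectives."""
    q0, q1 = 1 - p0, 1 - p1          # assumed: q_i = 1 - p_i
    return r * log(p0 / p1) + s * log(q0 / q1) + log(a0 / a1)


# Hypothetical values: good quality p0, bad quality p1, prior probabilities a0, a1.
p0, p1, a0, a1 = 0.01, 0.05, 0.9, 0.1

# Hypothetical store: batch number -> (defectives r, non-defectives s) seen so far.
samples = {101: (0, 50), 102: (3, 47), 103: (1, 49), 104: (6, 44)}
flags = {batch: log_odds_good(r, s, p0, p1, a0, a1)
         for batch, (r, s) in samples.items()}

resample_next = min(flags, key=lambda batch: abs(flags[batch]))  # flag nearest log-odds 0
deliver_first = max(flags, key=flags.get)                        # most likely to be good
return_first = min(flags, key=flags.get)                         # most likely to be bad
print(resample_next, deliver_first, return_first)
```

The same table of flags serves all three decisions: idle inspectors take the batch nearest the middle, an urgent order takes the batches furthest to the good side, and a returning van takes those furthest to the bad side.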

One can get a similar sort of approach to inspecting by variables where the 
posterior distributions would not be represented by a single number representing 
the log odds, but commonly one could take as a reasonable approximation to 
the quality distribution a normal distribution specified by a mean and a variance 
and then you would have to put flags on a two-dimensional board. I shall suppose 
the axis for mean quality is horizontal, while the variance axis is vertical. Then 
one would have to draw roughly semi-elliptical contours of constant risk, and 
as one learned more about batches their representative points would move up- 
wards, and to the right if quality is high, to the left if quality is probably low. 
In operation, one uses the chart as before, not tying oneself to any hard and fast
rule. People familiar with the power politics and struggles that go on between 
the inspection and production departments will realise that there are dangers 
attached to this suggestion. 
I. D. HILL


I want to make two remarks. The first a good many people here will
recognise as something I have said quite frequently before, but nevertheless I
think it is still worth another mention, since we have not heard anything
about it this afternoon. It is that one of the objects of a sampling plan
(not necessarily always, but in many circumstances) can be not so much to try
to sort out the good batches from the bad ones, as to remind people that quality 
does matter and to give an incentive, in the hope that the very existence of the 
sampling plan will change the prior distribution. I think I am right in saying 
that all we have heard this afternoon does depend upon the existence of a fixed 
prior distribution. This alternative point of view is one that I consider to be 
very important. I am not saying that these economic schemes are useless by 
any means. If we do believe we have got a fixed prior distribution, and what 
we want to do is to minimise the total costs, it is obvious that we must proceed 
on lines such as this. There may be cases when we want to do those things, but 
there are many cases when this is not the particular point of view that we should 
seek. Professor Barnard in particular was saying that the American Military 
Standard, which I admit has many arbitrary features, was a less exact way of 
getting at things than using the economic criteria. So it is, if what you want is 
to fulfill these economic criteria. I am not convinced that this is always the 
answer, and I think that the American Military Standard, although it has many 
arbitrary features has also a great many useful features, and from the point of 
view I am considering it is difficult to see how you can get completely away 
from choosing things arbitrarily to some extent. One particular thing that is 
very arbitrary in the American Military Standard is the relationship between 
lot size and sample size. I am very interested in Professor Hald’s equations show- 
ing that depending on the conditions either the square root or the logarithm of 
the lot size is a good thing to take and I should like to ask him, because I am not 
quite sure for myself, whether either one or other of those relationships or a 
compromise between them is a good thing anyway, even if you are picking 
sampling plans somewhat arbitrarily on the basis of choosing points on the 
operating characteristic or do those relationships apply only if you are picking 
schemes on the economic criteria that he has put forward. 

Finally I would like to say just a word on what I might call contemporary 
history because I know that some of you know that I have been busy in putting 
forward some rival proposals to the American Military Standard, but on basically 
very similar lines to the American document. One of the things that had to be 
done with the document that we have been producing in this country was to 
send it to the Americans for their comments and one of the comments that they 
made on our document was that it would be a good idea to set up an inter- 
national working party to see whether we can get the best of both worlds or 
perhaps something better than either of them. This international working party 
on which the United Kingdom, the United States and Canada are represented 
has now been set up, and a first meeting was held in Washington last month. 
From the results of that meeting it looks as though we may be able to produce 
a document still on the same basic lines as the American Military Standard, 
but varying quite considerably in the details and I hope that we are going to
produce something better than either the existing American one or the proposed 
British one. How many years it will take to do so I do not know. These things 
seem to take longer than in fact you think they are going to at first sight. 


I. J. GOOD


May I ask if there is a condition on the prior distribution in connection with
the relation between sample size and lot size? If it is discontinuous in the neighborhood of a
particular point, is that what decides whether you take root n or log n in the
asymptotic formulas of the sample size? I imagine that what matters is whether 
there is a discontinuity at the critical point. 


G. HORSNELL 


I would just like to point out that this relationship between the sample size 
and the logarithm of lot size also holds in the scheme which I put forward some 
three years ago and also I think a student here at Imperial College, Dr. Vagholkar, 
found the same relationship. So it is not only applicable to the schemes that 
Professor Hald has put forward. It does seem to me to suggest that this is almost 
universally true when we are considering the economics of sampling inspection 
schemes. The other point I would like to make is that the type I curve or the 
Beta distribution, as perhaps it is more generally known, has proved useful in 
graduating process curves. I should say perhaps that it has proved useful in 
graduating the sample results from process curves. It is of course not possible 
to infer from a successful graduation of the sample results that the underlying 
process curve is a Beta distribution. In fact one gets good graduations using 
the negative binomial. The underlying process curve then would be the type III 
or the γ distribution. But I have a couple of examples where assuming the under- 
lying distribution to be type I does give a reasonable graduation to sampling 
results. 

In connection with Dr. Cox’s paper I am very interested in this theory, and 
I know he would be able to generalise it to the case where, as in Mr. Hill’s and 
Dr. Warner’s and my own work, we look at sample results and we decide to 
re-sample, that is to take a second sample from those batches which are grouped 
between batches which show a defective at the first sample. This of course does 
not extend to the case where the sample immediately prior to the batch re- 
sampled shows no defective at all. It is perhaps a rather specialised type of 
dependence, more specialised perhaps than Dr. Cox was putting forward today. 


Dr. Horsnell subsequently added in writing: 


I put forward the following Deferred Sentencing Scheme several years ago 
at the Ordnance Board. It has the desirable property of an upper limit to accept- 
ance probabilities under completely random conditions, which is in fact given 
by the O.C. curve of a simple batch-by-batch scheme. 

The scheme is designed to accept batches of 1% defective quality on over 
95% of occasions on which they are offered and to reject batches of quality 
worse than 1% defective as frequently as possible, and the specification is as 
follows: 

A first sample of 10 items is taken at random from each batch and inspected. 
The batch is then either accepted immediately, or a second sample of 20 items 
is taken at random and inspected, according to the following rules: 

(a) a second sample from a batch is inspected if either 
(i) its first sample contains 1 or more defectives, or 
(ii) its first sample has 0 defectives and the batch is included in a sequence 
of 5 batches or fewer with a combined total of 2 or more defectives in 
their first samples. Such a sequence must begin and end with a batch 
with at least 1 defective in its first sample, and a second sample is taken 
from every batch in the sequence. 
(b) If condition (a) does not hold the batch is accepted immediately. 

When a second sample is inspected, the batch is rejected or accepted on the 
results of the second sample alone as follows: 


(c) the batch is rejected if either 
(i) its second sample has 1 or more defectives, or 
(ii) its second sample has 0 defectives and the batch is included in a sequence 
of 5 batches or fewer with a combined total of 2 or more defectives in 
their second samples. Such a sequence must begin and end with a batch 
with at least one defective in its second sample and every batch in the 
sequence is rejected. 
(d) If condition (c) does not hold the batch is accepted. 

Under completely random conditions of batch quality the above scheme yields 
acceptance probabilities which are no greater than for the batch-by-batch 
acceptance sampling scheme: 

First sample size 10. Accept if 0 defectives, take a second sample if 1 or more 
defectives. 

Second sample size 20. Accept if 0 defectives, reject if 1 or more defectives. 

The Acceptance Probabilities are as follows: 


                     Deferred sentencing scheme    Upper limit 
% defective          (constant batch               (random batch 
in batch             quality)                      quality) 

10                   0.05                          0.43 
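
The upper-limit column can be checked against the batch-by-batch scheme stated above (first sample 10, second sample 20, accept only on 0 defectives). The following is a minimal sketch, assuming binomial sampling, that is, batches large compared with the samples; the function name and parameters are illustrative and not part of the original scheme.

```python
def accept_probability(p, n1=10, n2=20):
    """Acceptance probability of the simple batch-by-batch scheme.

    p  : fraction defective in the batch
    n1 : first sample size, n2 : second sample size
    Assumes binomial sampling (batch large relative to the samples).
    """
    clean_first = (1.0 - p) ** n1    # P(0 defectives in the first sample)
    clean_second = (1.0 - p) ** n2   # P(0 defectives in the second sample)
    # Accept at once on a clean first sample; otherwise a second sample
    # is taken and the batch is accepted only if that sample is clean.
    return clean_first + (1.0 - clean_first) * clean_second


for percent in (1, 10):
    print(percent, round(accept_probability(percent / 100.0), 2))
```

Under these assumptions the value at 10% defective is about 0.43, in agreement with the tabulated upper limit, and the value at 1% defective is about 0.98, above the 95% requirement.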


Thus it seems that for a slightly reduced chance of accepting batches at 1% 
defective a very much steeper O.C. curve is obtained using the deferred sentencing 
scheme clauses while retaining an upper limit to the acceptance probabilities 
under completely random conditions. It would be possible to construct similar 
schemes with this desirable feature. 


D. R. Cox 


Could I just reply to that last point. It was dealt with in part in the end—in 
the part that I didn’t read out. In principle there is no difficulty at all if you 
have any system of sample size whatever. If for instance on one particular batch 
it happens you have taken three times the usual sample size, it simply shifts 
the posterior probability correspondingly in a way that can be easily worked 
out. And that can be done for a situation in which the sample size is varied 
arbitrarily from batch to batch. There is no difficulty in principle although it 
could get very complicated. I think the very interesting question that raises 
is the extension to a yet more complicated version of the suggestion that Professor 
Barnard has raised, because one could do this when posterior odds that a 
batch is bad are assigned to the batch, but the posterior odds depend now 
not only on the particular batch in question but also on observations on neigh- 
bouring batches. One has the rather complicated sequential situation where you 
can improve your information about this batch by taking observations not only 
on this batch but also on adjacent batches. And so you may have the situation 
where it may perhaps not be clear which further batch observation should be 
taken on, because it may be that it is perhaps particularly desirable to get more 
information about that batch, but it pays to look instead at batch no. 100, 
because that gives you also more information about batch 99, which you are 
also mildly interested in. It raises the same sort of strategic question, only in a 
more complicated form because by taking observations only on one batch all 
the points are affected to some extent. 


F. J. ANSCOMBE* 


Professor Hald has put a great deal of work into this paper. It may well be 
several years before full advantage will be taken of all the material on compound 
hypergeometric distributions and on the cost of single sampling plans under 
various types of distribution of lot quality. The results here presented and the 
tabulations promised will be valuable for research as well as practice. In the 
asymptotic results of Section 9, Professor Hald has been anticipated by Guthrie 
and Johns, but his less formidable mathematical apparatus will no doubt make 
the results more generally accessible. 

During the short time available I have thought seriously only about the fram- 
ing of the problem which Professor Hald has studied, and not about the results 
obtained. Had this paper appeared ten years ago, it would have been most 
original and could have elicited nothing but praise. But by now the literature 
of economic analysis of inspection is sizable, and we are no longer pioneers. A 
rather high standard of precision and perceptiveness can be expected in new 
work. At two points I have been surprised by what has seemed an unduly rough- 
and-ready treatment. 

* Professor Anscombe’s comments were prepared beforehand and read at the meeting. 

The cost functions proposed in Section 6 seem to be intended to represent a 
considerable range of inspection situations. (The diagram at the start looks 
comprehensive; and there is a phrase about previous authors who “have con- 
sidered only special sampling plans.”) But I find I am quite unclear as to pre- 
cisely what situations Professor Hald has in mind. His cost functions are certainly 
reasonable for rectifying inspection, where the inspection is nondestructive, all 
defectives found are replaced by good articles, and “rejection” is a misnomer 
for “inspection of the balance of the lot.” Are they reasonable for anything else? 
Breakwell, to judge from his examples, has in mind destructive inspection of a 
sample which is additional to the lot, coming from the same unlimited source; 
the sample size could conceivably exceed the lot size. His loss function is essen- 
tially different from Professor Hald’s. Suppose we had to consider destructive 
inspection of a sample which was taken out of a limited lot. It would probably 
now seem appropriate to reckon that the cost of sampling depended on the 
quality of the lot as well as on the size of the sample, because the value of the 
net product would be reduced more by taking out and destroying the sample if 
quality were good than if it were bad. Now neither Breakwell’s nor Professor 
Hald’s loss function will do (though Guthrie and Johns’ loss function covers 
this case, as well as the rectifying case). Horsnell has considered inspection made 
by a purchaser, whose successive transactions are interdependent, in Schlaifer’s 
phrase; and yet another loss function is encountered. There are indeed many 
different inspection problems that can arise, and small variations in the circum- 
stances are liable to prove important. It would be helpful if Professor Hald 
could be more specific about the application of his loss function. 

The other curious feature of this work is the restriction to single sampling. 
Admittedly, when inspection involves a lengthy test, such as a life test, there 
may be a large component of cost in multiple sampling caused by the delay in 
reaching a decision. But normally this type of cost is absent or slight, and then 
sequential plans merit serious consideration. Now Moriguti and Breakwell, 
in considering their problem, seem to have found not much difference in losses 
between sequential plans and single sampling plans. On the other hand, for 
rectifying inspection I have found that the average loss (regret) for a well- 
chosen linear sequential plan is something like a half that for the best single 
sampling plan, provided the lot size is large enough for the break-even number 
of defectives initially present in the lot to be moderately large. To halve the 
losses may not be a spectacular improvement, but it ought to be perceptible. 
The relative advantage of the sequential plan over the single sampling plan 
increases slowly as everything else gets large, and therefore the asymptotic 
results of Professor Hald and Guthrie and Johns are less interesting than they 
might appear at first glance. 


A. HALD 


I would like to make a few comments on some of the questions raised. First 
with regard to Dr. Wetherill’s paper, I like this approach of reducing the number 
of parameters in the mixed binomial very much. The general solution for the 
mixed binomial with an arbitrary number of components has been given in my 
paper, but it is rather complicated, so it is a very good way to reduce this compli- 
cated solution to a handy one by suitably restricting the parameters. I suppose 
there is still sufficient flexibility in the resulting distribution of Dr. Wetherill’s 
paper so it will be possible to represent many prior distributions by means of 
this mixed binomial. 

I also liked Professor Barnard’s allusion to the dictionary very much and think 
it is a good comparison. Much could be said for and against the Military 
Standard. I am not against the Military Standard. I very much hold the same 
point of view as Mr. Hill on the Military Standard. As you say, the proof is in 
the eating of the pudding. The Military Standard is the one accepted all over 
the world, and it is functioning rather well according to my experience. But on the 
other hand I think that it is essential for further progress in this field, that the 
conditions for choosing the different sampling plans should be more specific 
than in the Military Standard. The Military Standard is best regarded as a 
general reference which one can choose when one has nothing better. 

With regard to the usefulness of the β-distribution as a prior distribution, 
I have recently got some data from hundred-per-cent inspection and it seems 
that the J-shaped β-distribution is very good in these cases. That means that the 
prior distribution may be represented by just one parameter. If that is a uni- 
versal feature for many components from mechanical industries, as it seems 
from these data, it will be easy to specify the prior distribution by just quoting 
the acceptable part of the prior distribution, or just by giving the average of the 
prior distribution, that is the one parameter that is necessary. 

I don’t think my paper is as vague as Professor Anscombe indicates with 
regard to stating the assumptions it is based on. The assumptions with regard to 
costs and prior distributions are pretty clearly stated but I admit that there 
are some defects in regard to telling in what industries and in what cases such 
costs or prior distributions occur, but if only the basic assumptions for a system 
are stated in the paper I think it is up to the man applying the system to find 
out whether the assumptions are fulfilled in his own case. But clearly, as Professor 
Anscombe writes, there is room for much further development. One of the 
problems I considered to some extent is to make the costs depend on the fraction 
defective in the lot. That naturally complicates the whole thing very much, so 
in the first instance I found it sufficiently difficult to use constant cost param- 
eters. Also the application of this system to sequential plans would be very 
desirable. Another extension would be to develop a similar theory for prior 
distributions where the number of components in the mixed binomial is a func- 
tion of the lot size. 


Professor Hald subsequently added the following remarks in writing: 


Mr. Hill’s first remark points out that the sampling plan chosen gives the 
producer an incentive to change his prior distribution whereas the theory 
presented here assumes a fixed prior distribution. To my knowledge all other 
systems of sampling plans rest on this same assumption, and changes in the 
prior distribution—observed by studying the inspection records—affect the 
choice of future sampling plans, i.e. as in the Mil-Std-105 by changing to reduced 
or tightened inspection. As I have remarked in the introduction it would be most 
desirable to have such a feed-back mechanism built into the system but until 
this has been achieved one may get along reasonably well by estimating changes 
in the prior distribution by some simple statistical method and then changing the 
optimum plan correspondingly. 

With regard to Mr. Hill’s second remark I think that you could get any 
relation between lot size and sample size on the basis of choosing points on 
the operating characteristic. It all depends on how the risks chosen are made 
to depend on N. 

Dr. Horsnell’s empirical evidence regarding the form of the process curve 
points in the direction of the beta-distribution and so does mine. This will, 
however, lead to a sample size proportional to the root of the lot size on the basis 
of my cost equation. 

Mr. Lindley is right in pointing out that De Finetti’s result in some way 
must be related to mine. Also Mr. Brøns has drawn my attention to a book by 
Fréchet [25] which seems to contain further interesting results in the same 
direction. 

In answering the questions of Dr. Good and Professor Anscombe I should 
like to comment briefly also on the very interesting paper by Guthrie and 
Johns [26] which contains asymptotic results similar to mine. 

Guthrie and Johns have limited themselves to studying asymptotic solutions 
only but have on the other hand been able to derive their results under more 
general conditions regarding the prior distribution than mine. Their basic 
assumption regarding the distribution of the number of defectives or defects in 
the sample (and also in the remainder) for given lot quality is that this distri- 
bution (apart from a scale factor) is binomial, negative binomial, or Poisson 
(their class denoted by ¢$,). They then combine these distributions (written 
in a common form) with two classes of prior distributions: 

$G_1$: consisting of all cumulative distribution functions which are twice con- 
tinuously differentiable in some open interval about the break-even quality 
and with a positive derivative in that point. 

$G_2$: consisting of all cumulative distribution functions which assign zero prob- 
ability to values of the parameter (lot quality) within a certain interval con- 
taining the break-even quality and assign finite probability to the end points 
of this interval. 

$G_1$ and $G_2$ correspond to the continuous and discontinuous weight functions, 
respectively, of my mixed binomials. 

In both cases they find that the acceptance number asymptotically is a linear 
function of the sample size. For the class $G_1$ they find $n \sim \sqrt{N}$ whereas for 
$G_2$ the result is $n \sim \log N$. 

Thus, Guthrie and Johns’ paper contains a more precise answer to Dr. Good’s 
question than mine. 

Professor Anscombe is somewhat worried over the limitation of my cost 
function. I must admit that I purposely tried to reduce the number of cost 
parameters as far as possible for obvious pedagogical and practical reasons. 
Professor Anscombe refers to Guthrie and Johns’ paper which contains a cost 
function with 6 cost parameters as compared to the 3 in my paper. It is, however, 
rather easy to show that Guthrie and Johns’ expected cost can be written as a 
linear function of my K(n, c) just as I have shown in section 6 how my model II 
can be reduced to model I. 

Combining Guthrie and Johns’ cost notation with my random variables we 
get the following two expressions for the costs associated with acceptance and 
rejection, respectively: 


$$s_1 x + s_2 n + a_1 (X - x) + a_2 (N - n)$$ 
and 
$$s_1 x + s_2 n + r_1 (X - x) + r_2 (N - n).$$ 


Putting $s_1 = a_2 = r_1 = 0$ and $a_1 = 1$ we have the simplified cost function 
which is my model I. From the point of view of determining the optimum 
sampling plan, however, it is not the structure of the above expressions which 
matters but only the structure of the average cost. Performing the operations 
leading to the expected cost as shown in section 6 we find 


$$K_1(n, c) = s_1 n \bar{p} + s_2 n + (N - n)(r_1 \bar{p} + r_2) + (N - n)(a_1 - r_1) \sum_{x=0}^{c} g_n(x)\,\bigl(p_n(x) - k_r\bigr).$$ 


Defining 

$$k_r = \frac{r_2 - a_2}{a_1 - r_1}, \qquad k_s = \frac{s_2 - a_2 + \bar{p}\,(s_1 - r_1)}{a_1 - r_1}, \qquad \delta = \frac{a_2 + \bar{p}\, r_1}{a_1 - r_1}$$ 

we have 

$$K_1(n, c) = (a_1 - r_1)\,\{N\delta + n k_s + (N - n)(k_r - \gamma(n, c))\} = (a_1 - r_1)\,\delta N + (a_1 - r_1)\,K(n, c)$$ 


where $\gamma(n, c)$ and $K(n, c)$ are the functions defined by (100) and (99) or (96). 
The optimum plan thus depends on the two fundamental parameters $k_s$ and 
$k_r$ which in turn are functions of the 6 cost parameters and $\bar{p}$. It therefore seems 
to me that for mathematical and numerical convenience it is preferable to 
keep to my model I and reduce other linear models as shown above. 
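
The reduction of the six-parameter cost function to a linear function of K(n, c) can be checked numerically. The sketch below is illustrative only: it assumes a mixed binomial prior with two components, so that, given lot quality p, the sample count x is binomial (n, p) and the expected number of defectives in the remainder is (N - n)p. The function names, the prior and the cost values are hypothetical and are not taken from the paper.

```python
from math import comb


def binom_pmf(x, n, p):
    """Binomial probability of x defectives in a sample of n."""
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


def direct_cost(N, n, c, prior, s1, s2, a1, a2, r1, r2):
    """Expected cost computed straight from the two cost expressions.

    prior: list of (weight, p) pairs of a mixed binomial (an assumption).
    """
    total = 0.0
    for w, p in prior:
        rest = (N - n) * p  # expected defectives left in the remainder
        for x in range(n + 1):
            if x <= c:      # acceptance
                cost = s1 * x + s2 * n + a1 * rest + a2 * (N - n)
            else:           # rejection
                cost = s1 * x + s2 * n + r1 * rest + r2 * (N - n)
            total += w * binom_pmf(x, n, p) * cost
    return total


def reduced_cost(N, n, c, prior, s1, s2, a1, a2, r1, r2):
    """Same expected cost via k_r, k_s, delta and the model I form K(n, c)."""
    p_bar = sum(w * p for w, p in prior)
    d = a1 - r1
    k_r = (r2 - a2) / d
    k_s = (s2 - a2 + p_bar * (s1 - r1)) / d
    delta = (a2 + p_bar * r1) / d
    # gamma(n, c) = sum over x <= c of g_n(x) * (k_r - p_n(x))
    gamma = sum(w * binom_pmf(x, n, p) * (k_r - p)
                for w, p in prior for x in range(c + 1))
    K = n * k_s + (N - n) * (k_r - gamma)
    return d * (delta * N + K)


# Hypothetical prior and costs, chosen only to exercise the identity.
prior = [(0.9, 0.01), (0.1, 0.10)]
costs = dict(s1=0.1, s2=0.2, a1=1.0, a2=0.05, r1=0.2, r2=0.1)
print(direct_cost(1000, 50, 2, prior, **costs))
print(reduced_cost(1000, 50, 2, prior, **costs))
```

The two printed values should agree to rounding error for any plan (n, c), which is why the optimum plan depends on the six cost parameters only through the two derived parameters.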
Variations Flow Analysis 


Norbert L. Enrick* 


Institute of Textile Technology 
Charlottesville, Virginia 


Variations Flow Analysis is a technique of evaluating the transfer of variations 
in stock, when the product from several machines at one processing stage is fed 
randomly to the several machines of the succeeding stage. This paper describes 
procedures, based on modifications of range methods for analysis of variance, which 
have been found of value in a large number of applications. The methods are illus- 
trated with examples from yarn manufacture. 


1. INTRODUCTION 


Variations Flow Analysis represents a special adaptation of analysis of vari- 
ance technique, for tracing the transfer of variations in stock from process to 
process in multi-machine and multi-stage processing. By tracing this flow, and 
comparing actual variations against expected values, it is possible to isolate and 
correct any places where excessive variations are being induced into the pro- 
cessing sequence. Measures leading to corrective action can then be concentrated 
on the places so isolated. 

The techniques developed permit handling of special processing situations, 
in which the flow of stock is complex, such as the following: 


1. Several machines in one processing department feed at random into a 
group of multi-spindle machines in the next processing department. 

2. The product is blended at various processing stages. 

3. The product is attenuated at various processing stages. 


The analysis procedures have been simplified, so that the average supervisor 
in the industrial plant will understand them sufficiently to make his own appli- 
cations. This feature of the procedures has been verified in actual practice and 
checked in industrial training courses over the past five years, in which attend- 
ants representing close to one thousand industrial supervisors were given sample 
problems to work out. 

The techniques described here are in actual use in over a hundred plants, 
representing spinners and weavers of cotton, woolen, worsted and synthetic 
textiles. However, there are many other industrial processing organizations, 
particularly in the foundry and in the chemical plant, where these techniques 
of Variations Flow Analysis should be applicable in a similar manner. 

Thus, even though the discussion in the following may refer to within-machine 
and between-machine variation in a spinning process, parallel application in 
other industries, such as to variations within and between lots in Bessemer steel 
rolling, becomes apparent. In describing the latter type of application, Weaver 
(14) deplores the difficulty of explaining variance analysis by means of con- 
ventional sums-of-squares techniques, a problem which the procedures in this 
paper seek to avoid by the use of methods involving ranges. 


Figure 1—Simplified Flow Chart of Stock in a Spinning Mill. 


2. THE MODEL 


The general model for which the Variations Flow Analysis techniques de- 
scribed here were developed is as follows: Production, from raw material to end 
product, involves several processing stages. In each stage, the product from one 
set of machines feeds at random into the several machines of the succeeding 
processing stage. At any stage, a blending may occur, whereby several units 
of the product are combined. A typical production process of this nature is 
illustrated by the flow chart in Figure 1, showing the manufacture of yarn in a 
typical spinning mill. The product from one set of machines feeds more or less 
at random into the machines of the next processing stage. Production is of the 
semi-continuous processing type. Strands of product are extruded on a con- 
tinuous basis on various spindles of one process; and when a package of several 
pounds has been produced, it is removed and then fed into the next machine. 
By means of successive attenuation of the stock, a rather bulky sheet of stock 
coming from the picker machines is gradually reduced to a fine thread of yarn 
in spinning. 

At any processing stage, a special type of blending or “doubling” may occur, 
wherein several strands are combined into a single new strand. Usually, there 
is at least one such blending, which occurs in the drawing process, and is illus- 
trated in Figure 2. Here six strands are fed into the drafting rolls, where they are 
blended together. At the same time, attenuation by drafting takes place, which 
is usually large enough so that at any one output position of the frame, the 
individual strand of blended sliver produced has approximately the same weight 
per yard as each of the several slivers fed. It is apparent that the physical opera- 
tions of blending and drafting are equivalent to the arithmetical process of 
adding and then averaging. Therefore, one of the results obtained is a reduction 
in variations of stock, in accordance with the well-known standard-error-of-the- 
mean effect. This statistical effect may have been recognized in intuitive form 
by Sir Richard Arkwright, when he developed the drawing frame (8). 


Figure 2—Drafting and Blending of Stock in a Drawing Frame, Schematic Top View. 
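
The averaging effect of doubling can be illustrated by a small simulation. This is a sketch under simplifying assumptions that are not in the paper: the strands fed are independent and normally distributed, and the mean weight and variation figures are hypothetical. With six doublings the coefficient of variation of the blended strand should fall to roughly 1/√6, about 41 per cent, of that of a single strand.

```python
import random
import statistics


def coefficient_of_variation(values):
    """Coefficient of variation in per cent."""
    return 100.0 * statistics.pstdev(values) / statistics.mean(values)


def simulate_blending(doublings=6, n_lengths=20000,
                      mean_weight=60.0, cv_percent=5.0, seed=0):
    """Compare the CV of single strands with that of blended strands.

    All numeric defaults are hypothetical illustration values.
    """
    rng = random.Random(seed)
    sd = mean_weight * cv_percent / 100.0
    fed = [rng.gauss(mean_weight, sd) for _ in range(n_lengths)]
    # Blending plus drafting back to the original weight per yard is
    # treated as averaging the weights of the strands fed together.
    blended = [
        sum(rng.gauss(mean_weight, sd) for _ in range(doublings)) / doublings
        for _ in range(n_lengths)
    ]
    return coefficient_of_variation(fed), coefficient_of_variation(blended)


cv_fed, cv_blended = simulate_blending()
print(cv_fed, cv_blended, cv_blended / cv_fed)
```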

In a typical mill, there may be from two to ten pickers, feeding more or less 
at random into from a hundred to three hundred cards. Each of these machines 
has one feed position and one output position. There may next be ten to thirty 
drawing frames, each with four input and output positions, followed by ten to 
thirty roving frames, and from fifty to three hundred spinning frames. A roving 
frame may have from eighty to 120 spindles, and a spinning frame may have 
from two hundred to four hundred spindles. The differences in number of units 
in each process are accounted for by differences in production rates employed. 

Depending upon the degree of diversification of a mill, there may be various 
lines of flow, with a particular raw material going over a particular set of machines 
to produce a particular yarn style. The Variations Flow Analysis must, of course, 
be carried out separately for each line of flow. 


3. SUITABLE MEASURE OF VARIABILITY 


In deciding upon a suitable measure of variability, consideration must first 
be given to the basic unit of measure. This unit, as obtained from tests known 
as sizing tests, is the stock weight or more specifically the linear density, usually 
expressed in such terms as “grains-per-yard” for sliver, and in the rather complex 
term “number of 840-yard lengths per pound” for roving and yarn. In listing 
the weight values for various yarns alongside the standard deviations, and re- 
peating this for rovings and slivers, the author has observed that in general the 
standard deviation increases in direct proportion to increases in linear density. 
Thus, it appears that the coefficient of variation is a good measure of variability, 
since it permits ready comparison of different yarns, rovings or slivers, without 
need to refer to the particular weight involved. In the analysis shown here, the 
coefficient of variation, v, in per cent is used; however, it is apparent that the 
same analysis could be carried out in parallel manner in terms of standard 
deviations. 

In tracing the flow of variations through successive processing stages, the 
following variation coefficients, in per cent, have been found to be of importance 
in each processing department: 


1. Within-machine variation coefficient, $v_w$. 

2. Between-machine variation coefficient, $v_b$. 

3. Departmental-overall variation coefficient, $v_o$, which in view of the generally 
normal patterns of $v_w$ and $v_b$ is the Pythagorean total: 


$$v_o = \sqrt{v_w^2 + v_b^2} \qquad (1)$$ 
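
As a small worked illustration of the Pythagorean total in (1), the sketch below combines a within-machine and a between-machine coefficient into the departmental-overall coefficient, and recovers the between-machine part from the other two. The function names and the numbers are hypothetical.

```python
import math


def overall_cv(v_within, v_between):
    """Departmental-overall coefficient from equation (1), in per cent."""
    return math.hypot(v_within, v_between)


def between_cv(v_overall, v_within):
    """Between-machine coefficient implied by (1); clipped at zero in
    case sampling error makes the overall figure the smaller one."""
    return math.sqrt(max(v_overall**2 - v_within**2, 0.0))


print(overall_cv(3.0, 4.0))   # 5.0
print(between_cv(5.0, 3.0))   # 4.0
```

Because the components add in squares, a between-machine coefficient much smaller than the within-machine one contributes very little to the overall figure.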


An effective way of presenting the relationships among these three measures 
to non-statistical personnel has been found to be by means of the frequency 
distribution patterns shown in Figure 3. 

Where a machine has several output positions or deliveries, such as a drawing 
frame, $v_w$ may again be broken up into within-delivery and between-delivery 
components. However, where this has been done by the author, it was usually 
found that the within-delivery component was of relatively minor magnitude. 
The analysis can therefore be simplified considerably by ignoring these smaller 
components of variation. 


[Figure 3: frequency-distribution curves for the individual machines, each showing the same within-machine pattern of variation about its own average level; differences between the machine average levels add between-machine variation about the overall average level.] 

Figure 3—Process Variation is Composed of Two Major Parts, Within-machine Variation 
and Between-machine Variation. (Reproduction from Enrick, N. L. “Quality Control,” Fourth 
Ed., 1960; permission of Industrial Press, New York.) 

Usual sizing test practice in a mill is to weigh 5-yard lengths of sliver, 12-yard 
lengths of roving and 120-yard lengths of yarn. Since the mill usually has a large 
amount of such sizing test data available, taken for the primary purpose of 
controlling average stock weights in processing, it appears logical to use these 
same weighings also for analysis of variations. However, variance-length curve 
analysis of textile strands has shown that the observed variation along a strand 
varies, being largest for small lengths and decreasing asymptotically as the 
lengths increase (13). Fortunately, for the range of weights normally encountered 
in textile processing, the conventional test lengths used in a mill fall within the 
asymptotic area of the variance-length curve. Thus, the use of existing mill data 
in Variations Flow Analysis appears justified. The practical conventions were, 
of course, developed long before the statistical methods of variance length 
analysis, and related autocorrelation and correlogram techniques had arrived. 

The case of sampling lengths, just as the prior case of blending, shows how the 
practical man can apply statistical principles correctly to mill problems, even 
though the theory of statistics as such has not as yet provided means to analyze 
such problems. There is, however, another field in which the practical man, and 
the practical mill production supervisor in particular, seems to have been less 
successful. This is in the field of realizing the effects of machine to machine 
differences. In particular, the supervisor may have been exceedingly good at 
setting up each of the machines in each department to produce good quality, 
with good uniformity. However, between machines, there may exist certain 
small but cumulatively important differences in roll diameters, gearings, 
settings and tensions, contributing to differences in weight between slivers, 
rovings and yarns. In the final assembly operation, weaving, these differences 
will then affect fabric appearance quality. Variations Flow Analysis will usually 
bring out these differences, as well as their likely causes, thus spurring correc- 
tive action. 


4. FLOW OF VARIATIONS 


From the general nature of textile processing as described, and the types of 
variation in weight found to be of importance, certain basic patterns governing 
the flow of variations from process to process become apparent. These patterns 
may be stated in the form of four Flow Rules, which have been numbered, so 
that distinct reference can be made to each in subsequent discussions in this 
paper. The Rules are: 


Flow Rule 1: Since generally several machines at one process feed more or less 
at random into the machines of the subsequent process, the departmental overall 
variation of the first process is the input source of the within-machine variation 
of the next process. 

Flow Rule 2: The within-machine variation of a particular process is the combined
result of the input from the prior process and the effects of any additional variation
that may have been introduced by the machines themselves.

Flow Rule 3: To the extent to which differences in average level between machines
exist, they thereby contribute to departmental overall variation. The larger this
between-machine variation, which is superimposed on within-machine variation,
the greater will be the departmental overall variation.

Flow Rule 4: Where channelling (or non-random flow) of stock occurs, the depart- 
mental overall variation may not show up immediately in the subsequent within- 
machine variation. Instead, it will then appear in the corresponding departmental 
overall variation of the process. For example, if there are differences in the 
average level between two groups of carding machines on the same stock, but 
each group channels into a distinct group of drawing frames, then this between- 
group difference on the cards will not show up in the within-machine variation 
in drawing; but it will show up as between-machine and therefore departmental 
overall variation in drawing. This fourth Flow Rule may be considered a special 
case of non-random flow, not covered by the first Flow Rule. 


It is evident that from the configuration of variations data in a mill, one may 
predict the existence of channelling without ever having set foot in the plant. 
This, in fact, has been done successfully in numerous cases. It should be noted 
here that channelling, as described above, is always detrimental where the 
process into which the stock is channelled includes blending or doublings. In 
simple terms, if the somewhat heavy stock from one group of machines is not 
blended with the somewhat lighter stock from the other group, then the “evening 
out” of differences cannot take place. In more technical terms, channelling of 
stock, by its selectivity or non-randomness, interferes with the (previously 
discussed) standard error of the mean effect in reducing variations. 

Important improvements in variation, and greatly enhanced overall benefits, 
have accrued to mills that have recognized and eliminated channelling and its 
detrimental effects. This is especially important since differences in weight or 
linear density of stock may represent corresponding differences in such factors 
as degree of parallelization of the fibers, effectiveness of combing and carding
action on the stock, and efficiency of removal of non-spinnable waste, such as 
short fibers, leaf and trash. 


Allowing for Blending 


It has already been demonstrated that the process of blending or doublings, 
accompanied by compensating draft, is equivalent to totaling and then averag- 
ing. Accordingly, by recourse to the standard error of the mean, and the assur- 
ance of the Central Limit Theorem (15) in case of (usually slight) deviations 
from normality, we may write the expectation formula: 


$v_{we} = v_{op} / \sqrt{n}$ (2)


where $v_{we}$ represents the within-machine variation expected at a given process,
$v_{op}$ represents the departmental overall variation of the prior process, and $n$
equals the number of strands combined by blending in the given process.

This formula for the effect of blending or doublings has been found to work
well in practice. However, a small allowance is usually added to the value of
$v_{we}$ to take account of the effects of drafting during blending.

Allowing for Drafting


Where drafting occurs without blending, the effect of draft might
be expected, from analogy with (2) above, to be written:


$v_{we} = (v_{op})(\mathrm{draft})^{1/2}$ (3)


where draft refers to the linear density of the stock fed divided by the linear
density of the stock produced. Unfortunately, this formula does not work out
well in practice. The reason is that, unlike the case of blendings where various
strands of stock are combined at random, drafting without doubling is a non-random
process along linear lengths of a single strand. The lengths exhibit
generally a high degree of auto-correlation and non-normality. Thus, since
there is absence of both random sampling and normality, the Central Limit
Theorem becomes inapplicable. In place of Equation (3), it has been found
necessary to set up special tests to evaluate the effects of draft, or to utilize
empirical allowances (4, 7).
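
The two expectation formulas can be compared numerically with a minimal Python sketch. The function names are illustrative, and Equation (3) is taken here as $v_{op}$ multiplied by the square root of the draft, the reading suggested by the analogy with Equation (2). The draft value in the example is hypothetical and is not taken from the paper.

```python
import math

def blended_within_variation(v_overall_prior, n_strands):
    """Equation (2): expected within-machine variation after n doublings."""
    return v_overall_prior / math.sqrt(n_strands)

def naive_drafted_variation(v_overall_prior, draft):
    """Equation (3): the analogy with (2), which the text reports does NOT
    hold up in practice; shown only for comparison."""
    return v_overall_prior * math.sqrt(draft)

# Illustrative inputs: 5.2 per cent overall variation, 16 doublings, and a
# hypothetical draft of 16 (the draft value is an assumption).
print(round(blended_within_variation(5.2, 16), 2))   # 1.3
print(round(naive_drafted_variation(5.2, 16), 2))    # 20.8
```

The first figure matches the expectation quoted for drawing in the case history of Section 6. The second shows only what the rejected analogy would predict, which is why special tests or empirical allowances are used instead.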


5. CALCULATION METHOD


From the constant need to control stock weight in a mill, abundant test data 
are usually on hand for the analysis of the variations in stock. Table I shows 
an abbreviated* set of such data, representing randomly selected machines 
from each of which four random bobbins were tested. The following ranges 
were obtained: 


1. The range within each machine, $R_w$.

2. The range-overall, representing the range on any particular day across
any row of test results, $R_o$.

3. The daily cross-range of the machine averages, $R_{\bar{x}}$.


From Table II, the calculation of $v_w$ is clear. The estimate of $v_o$ is shown by
Methods 1 and 2. While Method 1 is statistically the less efficient, it has the
advantage of being more readily understood by foremen and other nonstatistical
personnel. In particular, the calculation steps of Method 1 are parallel to the
steps in determining $v_w$, and can thus be grasped intuitively; while on the other
hand the additional step of Method 2, shown in Line 10, seems to present considerable
difficulty to intuitive explanation.

Any excessive between-machine variation could now be found from the squared 
ratio of the average of the cross ranges to the average within-machine range, as 
shown in Equation (15) in a later part of this paper. However, for purposes of 
direct comparisons of coefficients of variation from process to process, as illustrated
in Table V, it has been found desirable to use a test involving the ratio
of $\hat{v}_o$ to $\hat{v}_w$ directly. This is the modified F-ratio, which for the example at hand is
found to be:

$F_{mod} = \hat{v}_o / \hat{v}_w = 5.39/3.97 = 1.36$ (4)


* NOTE: The 48 test results used in the illustration here represent a relative minimum
of data. Usually, twice as many or more test results are readily available in the mill, and
should be used, so as to minimize the loss in precision of the range method as against the
sums of squares approach. The relative efficiency of the range in estimating the standard
deviation has been the subject of prior investigations (5, 10, 11).

TABLE I
Test Results on Spinning Bobbins. Data in Terms of Yarn Number*

[Table body not recoverable. Columns: Machine Number; Range, Overall; Range of Averages.
Rows for each group of machines: individual test results, Total, Average, Range Within.
Legible entries: Total 225.9; Average 56.5, 57.6; Range Within 2.4, 3.5.]

Grand Total: 2639.8 Grand Average: 55.0


* Yarn Number represents the number of 840 yard lengths of yarn per pound weight. 


From Table III, this ratio is found to be greater than the minimum $F_{mod}$ ratio
of 1.133 required for significance at the 5 per cent level, corresponding to r = 4, 
k = 12, thus showing that the overall variation is significantly greater than the 
within-machine variation. This indicates the presence of excessive between- 
machine differences, causing significant differences in average weight of stock 
coming from different frames. The causes, be they in roll weighting, tensions, or 
gearing, need correction so as to standardize the effective drafts among all 
frames. 

More formal variance analysis procedure, using conventional sums of squares 
techniques, would have yielded slightly different estimates of variation co- 
efficients and again a significant F-ratio for the excess of departmental overall 
over within-machine variation, as shown by the data in Table IV.

TABLE II

Calculation of Estimated Variation Coefficients, Range Methods,
using Test Data from Table I

(Carets ^ Denote Estimated Parameters)

                                                  Departmental Overall
                                      Within      ---------------------
 Computation Steps                    Machine     Method 1     Method 2

  1. Total of Ranges                   54.2         73.6         12.7
  2. Number of Ranges                  12           [?]          3
  3. Average Range, (1)/(2)            4.517        [?]          4.233
  4. Sample Size                       r = 4        [?]          n = 4
  5. Conversion Factor*                2.07         [?]          2.12
  6. Standard Deviation,
     $\hat{\sigma}$, (3)/(5)           2.182        [?]          1.997
  7. Quantity Estimated by
     $\hat{\sigma}$                    [?]          [?]          [?]
  8. Variation Coefficient,
     $\hat{v}$, %, 100 × (6) ÷
     Grand Ave. of 55.0                3.97         [?]          [?]
  9. Symbol                            [?]          [?]          [?]
 10. $\hat{v}_o$ from Range of
     Averages**,
     $\sqrt{\hat{v}_{\bar{x}}^2 + (1 - 1/r)\hat{v}_w^2}$                      [?]


* From David’s table (2, 11) as extended by Duncan (3), based on the Number of Ranges
and Sample Sizes shown in Lines 2 and 4 above. When the number of ranges used exceeds
ten, Tippett’s $d_2$ values will not introduce appreciable bias.

** Based on Equation 13 in this paper.
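
The range-method arithmetic behind Table II can be retraced with a short Python sketch. It uses only figures legible in Tables I and II: the totals of ranges, the conversion factors 2.07 and 2.12, and the grand average of 55.0. Two inputs are assumptions: that Method 1 rests on twelve overall ranges of four tests each, and that it therefore shares the 2.07 conversion factor of the within-machine column. The function and variable names are illustrative.

```python
import math

GRAND_AVERAGE = 55.0  # from Table I

def variation_coefficient(total_of_ranges, number_of_ranges, conversion_factor):
    """Lines 1-8 of Table II: average range -> std. deviation -> per cent of mean."""
    average_range = total_of_ranges / number_of_ranges
    std_dev = average_range / conversion_factor
    return 100.0 * std_dev / GRAND_AVERAGE

v_w = variation_coefficient(54.2, 12, 2.07)      # within machine
v_o_m1 = variation_coefficient(73.6, 12, 2.07)   # Method 1 (12 ranges ASSUMED)
v_xbar = variation_coefficient(12.7, 3, 2.12)    # variation of machine averages

# Line 10 (Method 2): overall variation from the range of averages, r = 4.
r = 4
v_o_m2 = math.sqrt(v_xbar**2 + (1 - 1 / r) * v_w**2)

print(round(v_w, 2), round(v_o_m1, 2), round(v_o_m2, 2))
print(round(v_o_m1 / v_w, 2))  # modified F-ratio based on Method 1
```

Under these assumptions the sketch gives a within-machine coefficient of about 3.97 and a Method 1 ratio of about 1.36, in agreement with Equation (4). Method 2 comes out near 5.0, close to the sums-of-squares estimate in Table IV.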


TABLE III
Values of the Modified F-Ratio, ($F_{mod}$)*
(5% Probability, 95% Confidence Level)

 No. of Tests
 per Sample,                     Number of Samples, (k)
 Sample Size
     (r)         10       12       14       15       18       20       25

      2         1.453    1.393    1.351    1.[?]    1.293    1.272    [?]
      3         1.223    1.197    1.178    1.[?]    1.151    1.141    [?]
      4         1.150    1.133    1.121    1.[?]    1.103    1.097    [?]
      5         1.113    1.101    1.092    1.[?]    1.078    1.074    [?]
      6         1.091    1.081    1.074    1.[?]    1.063    1.060    [?]
      8         1.066    1.059    1.053    1.[?]    1.046    1.043    [?]
     10         1.052    1.046    1.042    1.[?]    [?]      1.034    [?]


* Calculated from Equation 11, and based on the reduced Degrees of Freedom tabulated
by David (2) and extended by Duncan (3) to allow for the approximately ten per cent loss
involved in the use of ranges. The resultant degrees of freedom involve decimals. The appropriate
values of F were obtained by interpolation of the five-place tables of Maxine Merrington
and Catherine M. Thompson (Biometrika, 33, p. 80).

TABLE IV
Analysis of Variance for Spinning Tests, Sums of Squares Method

                        Sum of    Degrees of           Mean     Quantity                                    Estimate                     Variation
 Source of Variation    Squares   Freedom              Square   Estimated                        F-ratio    of Variance                  Coefficient

 a. Between Machines,
    Within Days         [?]       3(4 - 1) = 9         18.2     rσ_b² + σ_w²                     4.14*      $\hat{\sigma}_b^2$ = 3.45    $\hat{v}_b$ = 3.38
 b. Within Machines
    = Residual          [?]       (3)(4)(4 - 1) = 36   4.4      σ_w²                                        $\hat{\sigma}_w^2$ = 4.40    $\hat{v}_w$ = 3.82**
 c. Departmental
    Overall,
    Within Days                                                 σ_b² + σ_w²                                 $\hat{\sigma}_b^2 + \hat{\sigma}_w^2$ = 7.85    $\hat{v}_o$ = 5.09**
 d. Between Days        [?]       [?]                  24.0     rnσ_d² + rσ_b² + σ_w²            1.32#      $\hat{\sigma}_d^2$ = 0.36    $\hat{v}_d$ = 1.09***


* Significant at 5% or better. # Not Significant.

** Compare with corresponding $\hat{v}_w$ and $\hat{v}_o$ obtained by the range method.

*** Of ancillary interest only for the present, since day-to-day variation is generally controlled by means of
control charts.

Note that r and n were defined in Table II.


6. A TYPICAL CASE HISTORY


The typical case history, in Table V, may be used to illustrate how the method 
of Variations Flow Analysis works. For each processing stage shown, 20 samples 
were tested, with each sample consisting of four specimens per machine. Ac- 
cordingly, it is found from Table III that any F > 1.097 indicates a significant 
excess, at the 5 per cent level, of overall variation over within-machine variation. 

For the first process, carding, the $\hat{v}_o$ of 5.2 is significantly higher than the $\hat{v}_w$
of 2.8, the F-ratio being greater than 1.097. Based on Flow Rule 3, this means
that there are excessive differences in average level between the machines in the 
carding process. A subsequent investigation indicated that off-standard trumpet 
sizes and gearing were responsible for these differences. 

The next process is drawing, in which a blending or doubling of sixteen strands 
of sliver takes place. Since generally several cards feed a particular drawing 
frame, the $\hat{v}_o$ of carding may be used to calculate the expected within-machine
variation in drawing, based on Equation (2). Thus:


$v_{we} = 5.2 \div \sqrt{16} = 1.3$. (5)


Table V shows that the actual within-machine variation in drawing is only 
0.7 in terms of per cent variation coefficient, which is considerably below the 1.3 
theoretically expected from the formula. Accordingly, one must suspect that 
the normal pattern of flow, usually expected from Flow Rule 1, may have been
upset by channelling, resulting in the pattern observed under the special case 
when Flow Rule 4 applies. A subsequent check revealed these suspicions to be 
true. There were actually two makes of cards, Saco-Lowell and Platt, which 
were located in two sections, A and B, of the card room. The Saco-Lowell cards 
in section A had a slightly higher draft, thus delivering somewhat lighter stock 
than the Platt cards in section B. Correspondingly, the Platt cards with a some- 
what lower draft, were delivering heavier stock. This difference in stock weight 
had contributed to the high departmental overall variation on the cards, which
was noted above. Now, when pushing the cans of sliver from the cards to drawing,
the tender would naturally push them the shortest distance. And, while
there was a certain amount of randomization of the cans, in general the relatively
light stock from the Saco-Lowell cards would go to the group of drawing frames
closest to section A, while the somewhat heavier stock from the Platt cards
would go to the group of drawing frames closest to section B. Within each
drawing frame, the variations input was thus either from Saco-Lowell cards or
from Platt cards. Following Flow Rule 2, each drawing frame would thus reflect
the group-overall variation fed from either section A or section B of the cards,
which was less than the departmental overall variation, for the simple reason
that the group-overall for each section did not include the effect of difference in
average level between the two groups. Thus the full amount of departmental
overall variation was not fed to any one drawing frame, resulting in an actual
within-machine variation which was less than would have been expected under
fully random flow of stock to each drawing frame.


TABLE V
Variations in Stock Weight Found in a Typical Mill

                             Variation Coefficient, Per Cent
 Processing Stage      Within-Machines    Departmental Overall    Modified F-Ratio
 or Department         ($\hat{v}_w$)      ($\hat{v}_o$)           ($\hat{v}_o/\hat{v}_w$)

 Carding                    2.8                  5.2                  1.857*
 Drawing                    0.7                  1.8                  2.571*
 Roving                     2.0                  2.8                  1.400*
 Spinning                   3.2                  3.4                  1.063

* Significant at 5 per cent or better, since $F_{mod}$, for r = 4, k = 20, is 1.097.

One might suspect from this example that channelling is actually beneficial, 
since the variation observed within the machines was less than expected under 
random flow. Actually, however, only a temporary advantage is observed. The 
differences between the two card groups continue to be reflected in the form of 
differences in weight of the stock, and will now show up in a high overall variation 
in drawing. Examination of Table V shows that the actual overall variation is 
indeed high, with a coefficient of 1.8 per cent. The channelled flow has prevented
the lighter stock from the Saco-Lowell cards from meeting the heavier stock from
the Platt cards in the blending operation of the drawing frames. This has minimized
the effectiveness of blending (in particular, the operation of the standard
error of the mean law) and has resulted in a higher overall variation
than would have been obtained under fully random flow of stock. In particular,
without channelling it might have been expected that the within-machine 
variation in drawing would have been only slightly above the 1.3 per cent 
predicted from Flow Rule 1 and Equation (2). The overall variation, in the 
absence of differences between drawing frames, would have been at a correspond- 
ing level. The actual overall variation, at 1.8 per cent, must thus be considered 
unduly high as a result of the channelled flow. It is interesting to note that the
mill subsequently shifted the carding machines, alternating the two makes
among adjoining floor positions, so that in the future there would be relatively
automatic randomization of the flow from the two types of cards to the drawing
frames. This insured better blending in drawing and, combined with a correction
of off-standard trumpets and gearing on several of the cards, resulted in substantial
improvements in card and drawing sliver uniformity of weight.

Continuing the analysis, it will be noted that in roving the within-machine 
variation exhibits a coefficient of 2.0 per cent, which is generally in accordance 
with expectation, based on Flow Rule 2, and a small allowance for the effect of 
drafting in roving. The overall variation of 2.8 per cent is significantly in excess 
of the within-machine coefficient. Applying Flow Rule 3, a check of equipment
was made, which revealed differences in gearing, roll diameters and cone belt 
starting positions among the roving frames, to be responsible for this excess 
variation. 

In the final yarn making process, spinning, the within-machine variation with 
a coefficient of 3.2 per cent is consistent with the variations input from roving, 
departmental overall, under Flow Rule 2. The overall variation in spinning is not 
significantly higher than the within-machine variation, indicating that there are 
no harmful between-machine differences between the spinning frames. 

The Variations Flow Analysis thus demonstrated places in carding, drawing 
and roving where excessive variations were being induced into the processing. 
Moreover, from an examination of the Flow Rules, the nature of this excessive 
variation is known. In this particular case history, the subsequent corrective 
actions taken, as outlined above, resulted in an eventual reduction in the overall 
variation coefficient of the yarn from a relatively medium 3.4 per cent to a 
relatively good 2.6 per cent. This is only one example, however. The validity of 
Variations Flow Analysis has now been confirmed in experience in over a hundred 
plants. The importance of the technique is enhanced by the fact that, accompany- 
ing reductions in variation of linear density or stock weight, there are improved 
processing conditions, in the form of lowered strand breakage rates in spinning
the yarn and in subsequent weaving, and improved appearance quality of the
more uniform product obtained.
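
The screening steps of this case history can be summarized in a small Python sketch. The stage values are those quoted in the text for Table V, and the critical value 1.097 is the Table III entry for r = 4, k = 20. The function names are illustrative. The plain less-than test for channelling is a simplifying assumption, since the paper speaks only of an actual variation "considerably below" the expected one and gives no numerical threshold.

```python
import math

F_MOD_CRITICAL = 1.097  # Table III value for r = 4, k = 20 (5 per cent level)

# (stage, within-machine v_w, departmental overall v_o), per cent, from the text
stages = [
    ("Carding", 2.8, 5.2),
    ("Drawing", 0.7, 1.8),
    ("Roving", 2.0, 2.8),
    ("Spinning", 3.2, 3.4),
]

def modified_f(v_w, v_o):
    """Modified F-ratio: overall variation divided by within-machine variation."""
    return v_o / v_w

def flag_between_machine(stages, critical=F_MOD_CRITICAL):
    """Flow Rule 3: stages whose overall variation significantly exceeds within."""
    return [name for name, v_w, v_o in stages if modified_f(v_w, v_o) > critical]

def channelling_suspected(v_overall_prior, n_doublings, v_within_actual):
    """Flow Rules 1 and 4: an actual within-machine variation below the
    Equation (2) expectation hints at non-random flow of stock."""
    expected = v_overall_prior / math.sqrt(n_doublings)
    return v_within_actual < expected, expected

print(flag_between_machine(stages))          # ['Carding', 'Drawing', 'Roving']
print(channelling_suspected(5.2, 16, 0.7))   # (True, 1.3)
```

The flagged stages are the three in which the analysis found excessive variation, and the drawing check reproduces the suspicion of channelling between cards and drawing frames.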


7. DERIVATION OF MODIFIED F-TABLES


The Modified F ratio values have been computed from Merrington and 
Thompson’s tables of the distribution of Snedecor’s F, (11), with reference to the 
concept of departmental overall variance developed in the following. In particular, 
let $x_{ij}$ represent a test result for specimen $i$ from machine $j$, with


$x_{ij} = A + b_j + e_{ij}$ (6)


Here $A$ is an overall average, the second term denotes machine differentials
independently normally distributed in the universe of machines with zero mean,
and the final term denotes individual specimen variations combined with errors
of testing and measurement, independently normally distributed with zero mean
and identical standard deviation, $\sigma_w$, for each. If a random sample of r items is
taken from a machine picked at random from the universe of machines, then as
shown for example by Cochran (1), the sampling variance in the mean of these
r items will be


$\sigma_{\bar{x}}^2 = \sigma_b^2 + \sigma_w^2 / r$ (7)


Because of the equivalence of $\sigma$ and $v$, we may develop the Modified F values
from the more customary $\sigma$, noting that $v$ can be substituted. The Modified F
is formed from the expression:


$F_{mod}(k, r) = \sigma_o / \sigma_w$ (8)


where:


$\sigma_o = (\sigma_b^2 + \sigma_w^2)^{1/2}$ (9)


Now, Snedecor’s F may be written as (1, 10):


$F(k, r) = (r\sigma_b^2 + \sigma_w^2) / \sigma_w^2$ (10)


From (8) and (10) it is apparent that:


$F_{mod}(k, r) = \sqrt{1 + (F - 1)/r}$ (11)


where F is an abbreviation for F(k, r). In calculating $F_{mod}$ in Table III, the
use of ranges was considered, by making an appropriate reduction of approximately
ten per cent in the Degrees of Freedom, based on Duncan’s tabulated
data (3).
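
The relation between Snedecor's F and the Modified F can be checked with a brief Python sketch of Equations (8) to (11). The function names and the numerical values of the standard deviations are illustrative assumptions, chosen only to confirm that the direct definition and the conversion from F agree.

```python
import math

def f_to_f_mod(f_value, r):
    """Equation (11): F_mod = sqrt(1 + (F - 1) / r)."""
    return math.sqrt(1.0 + (f_value - 1.0) / r)

def f_mod_to_f(f_mod, r):
    """Inverse of Equation (11): F = 1 + r * (F_mod**2 - 1)."""
    return 1.0 + r * (f_mod**2 - 1.0)

def f_mod_from_sigmas(sigma_b, sigma_w):
    """Equations (8) and (9): overall standard deviation over within."""
    return math.sqrt(sigma_b**2 + sigma_w**2) / sigma_w

def f_from_sigmas(sigma_b, sigma_w, r):
    """Equation (10): (r * sigma_b**2 + sigma_w**2) / sigma_w**2."""
    return (r * sigma_b**2 + sigma_w**2) / sigma_w**2

# Consistency check with arbitrary illustrative values (not from the paper).
sigma_b, sigma_w, r = 1.5, 2.0, 4
direct = f_mod_from_sigmas(sigma_b, sigma_w)
via_f = f_to_f_mod(f_from_sigmas(sigma_b, sigma_w, r), r)
print(math.isclose(direct, via_f))  # True
```

The inverse function shows which ordinary F value corresponds to a tabulated Modified F, for instance `f_mod_to_f(1.097, 4)` for the r = 4, k = 20 entry of Table III.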

An interesting use of the cross-range of averages is also possible, based on the
work of Patnaik (9). In particular, for k samples of c machines, each selected at
random from a large universe of machines, the standard deviation in the mean
can be estimated from the average of the cross-range, $\bar{R}_{\bar{x}}$, divided by the
appropriate value of $d_2^*$, for k samples of c each, tabulated by David (2) and
Duncan (3) and based on values due to Hartley (6) and Pearson (10, 11).

An estimate of the standard deviation within machines, $\sigma_w$, can be obtained
from the average within-machine range, $\bar{R}_w$, divided by Duncan’s $d_2^*$, for k
samples of r each. Substituting in (7) and transposing, we can now write the
estimate of the variance of machine differentials:


$\hat{\sigma}_b^2 = (1/r)[r(\bar{R}_{\bar{x}}/d_2^*)^2 - (\bar{R}_w/d_2^*)^2]$ (12)


From which we obtain, by reference to Equation (9), an estimate of the
overall variance:


$\hat{\sigma}_o^2 = (\bar{R}_{\bar{x}}/d_2^*)^2 + (1 - 1/r)(\bar{R}_w/d_2^*)^2$ (13)


This permits use of the cross-range of the averages to estimate $\sigma_o$. Substituting
this in (8), we would have:


$F_{mod}(k, r) = [(\bar{R}_{\bar{x}}/d_2^*)^2 + (1 - 1/r)(\bar{R}_w/d_2^*)^2]^{1/2} / (\bar{R}_w/d_2^*)$ (14)


Where r represents the modified degrees of freedom paired with $d_2^*$ in Duncan’s
table (3). Reducing (14) we find:


$F(k, r) = r(\bar{R}_{\bar{x}}/d_2^*)^2 / (\bar{R}_w/d_2^*)^2$ (15)


This depends on the usual F table and is analogous to the form given by
David (2). It is presented here to show the various equivalent values of relationship
of F and $F_{mod}$. Equation (15) may, of course, be used in Variations Flow
Analysis, but introduces complications with regard to the determination of
Degrees of Freedom. Moreover, it does not permit direct comparisons of the
important estimate $\hat{\sigma}_o$ against $\hat{\sigma}_w$, without backtracking to $\bar{R}_{\bar{x}}$.
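
Equations (12) to (15) can be sketched in Python as follows. The function and argument names are illustrative assumptions. The example inputs are the average ranges and conversion factors legible in Table II, with each conversion factor taken as the $d_2^*$ value for its own set of ranges. Note that the estimate from Equation (12) can turn out negative when the machine averages vary less than the within-machine variation alone would imply.

```python
import math

def range_estimates(r_bar_xbar, d2_xbar, r_bar_w, d2_w, r):
    """Equations (12)-(15) from average ranges.

    r_bar_xbar : average cross-range of machine averages
    d2_xbar    : d2* factor for the cross-ranges
    r_bar_w    : average within-machine range
    d2_w       : d2* factor for the within-machine ranges
    r          : tests per machine
    """
    s_xbar = r_bar_xbar / d2_xbar               # std. deviation of machine means
    s_w = r_bar_w / d2_w                        # std. deviation within machines
    var_b = (r * s_xbar**2 - s_w**2) / r        # Equation (12)
    var_o = s_xbar**2 + (1 - 1 / r) * s_w**2    # Equation (13)
    f_mod = math.sqrt(var_o) / s_w              # Equation (14)
    f = r * s_xbar**2 / s_w**2                  # Equation (15)
    return var_b, var_o, f_mod, f

# Average ranges and factors as legible in Table II (4.233 / 2.12, 4.517 / 2.07).
var_b, var_o, f_mod, f = range_estimates(4.233, 2.12, 4.517, 2.07, r=4)
print(round(f_mod, 3), round(f, 3))
# Equation (11) ties the two ratios together.
print(math.isclose(f_mod, math.sqrt(1 + (f - 1) / 4)))  # True
```

The last line confirms that the ratio from Equation (14) and the ordinary F from Equation (15) satisfy Equation (11), so either form carries the same information.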


TECHNOMETRICS AUGUST, 1960


A Semigraphical Method for the Analysis of Complex Problems


EDGAR ANDERSON*


Curator of Useful Plants, Missouri Botanical Garden 
and Engelmann Professor of Botany, Washington University (St. Louis) 


Recognizing associations between large numbers of variables is a problem en- 
countered in all the sciences. For this reason the Editor felt that the following article 
by Dr. Edgar E. Anderson, which appeared in the Proceedings of the National Academy
of Sciences, Vol. 43, pp. 923-27, 1957, would be of interest to the readers of Technometrics.
The article is republished with the kind permission of Dr. Anderson and of
Dr. Wendell M. Stanley, the Editor of the Proceedings of the National Academy of 
Sciences. 


Science and technology have many problems difficult to record, measure, or 
analyze because many variables or complexes of variables have an important 
bearing on the end result. It is difficult to measure and to get in mind the various 
inter-relationships of several sets of facts for each of a considerable number of 
individuals. This is particularly true when some or all of the basic data are 
complex patterns, difficult or impossible to code in numbers. Such a problem is
the relation (As/Az) between species differences and individual differences 
between plants or animals from which they were ultimately derived.’ 

After various attempts to estimate the relation (As/Az) efficiently, a simple 
semigraphical method was gradually evolved. It is now being used on data 
from various groups of organisms in the analysis of variation in natural popu- 
lations. With the encouragement of several mathematicians, particularly E. B. 
Wilson and J. W. Tukey, I have recently been exploring its general applicability. 
On a trial basis it has given promising results with data from physiology, mor- 
phology, psychology, and linguistics. 

It is planned ultimately to illustrate the method with a series of joint papers, 
showing its applications to various fields. In the meantime (since only its special- 
ized use in analyzing population variation has been published) a short, general- 
ized account seems desirable. 

The method had its inception in an attempt to measure variation in and 
between maize fields in Mexico.” Two easily measured features of the maize 
ear—row number and kernel width—were plotted on the X- and Y-axes of 
a Cartesian grid. The individual dots of the resulting scatter diagram were 
replaced by precise but semipictorial glyphs recording kernel shape and kernel 
texture. In analyzing variation in natural populations, these more or less pictorialized
glyphs were replaced by generalized glyphs,” on each of which the variation
of seven or more variables could be recorded. Subsequently’’* these glyphs were
modified to permit more efficient scanning of the “pictorialized scatter diagrams”
in which they were used. It has since been realized that the glyphs (used quite
apart from the scatter diagrams) can be arranged according to sequences in
time, space, or development. They can be averaged or transferred from a scatter
diagram to a map by making appropriate modifications. It seems likely that,
with their continued employment in logical operations, various other uses for
them may be found. It therefore becomes convenient to have a generalized name
for the glyphs apart from their use in pictorialized scatter diagrams, and they
are accordingly named metroglyphs.


* I am indebted to the Guggenheim Foundation and to Princeton University for fellowships
which made it possible to explore these possibilities.

The use of metroglyphs is illustrated in a simple generalized way in Figure I. 
Given (upper left) four individuals—1, 2, 3, and 4—each of which has been 
measured (or ranked in one of three grades) for each of five qualities—A, B, C, 
D, and E. The key for coding this problem is shown in Figure I, upper right, and 
the resulting metroglyphs for the four individuals are shown at right center. 
Each of the qualities is diagramed by a ray, the rays for any one quality having 
the same position on each glyph. For each quality a long ray indicates that the 
individual is high for that quality, a short ray that it has a medium value; no 
ray at that position indicates that the individual in question has a low value. 
In other words, each glyph with no rays indicated low values for all five variables.

It is for this reason that the rays are placed at easily visualized and remembered
points. There are three rays from the upper pole of the glyph, two others at its
equator to right and left. It is then easy to train one’s self to see the glyph as a
whole and to take in all the information from it. The operator’s eyes look at the
glyph of number 1, and almost immediately his mind interprets it as low A,
medium B, high C, low D, low E. He will also see, almost at a flash, that the
total magnitude of the glyph, considering all five qualities at once and giving 
them equal weight, is about one-third of the way from minimum total (low A, 
low B, low C, low D, low E) to the maximum total value (high A, high B, 
high C, high D, high E). 

By assigning values of 2 for each long ray and a value of 1 for each short ray, 
it is possible to assign scores to each glyph and to arrange them as a frequency 
diagram of an index (Fig. I, lower right). The minimum total value per glyph 
(a glyph with no rays) is 0; the maximum total value (a glyph with five long rays) 
is 10. If there are logical or practical reasons for so doing, the index may be 
weighted, assigning different values to different rays. 
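
The scoring rule just described can be written as a minimal Python sketch. The dictionary layout, the function name, and the optional `weights` argument are illustrative assumptions. Individual 1 is coded as described in the text; the ratings of the other three individuals are not given in the text and are therefore not reproduced.

```python
RAY_SCORES = {"none": 0, "short": 1, "long": 2}  # low, medium, high

def glyph_index(rays, weights=None):
    """Sum the ray scores of one glyph; `weights` optionally weights each quality."""
    weights = weights or {}
    return sum(RAY_SCORES[ray] * weights.get(quality, 1)
               for quality, ray in rays.items())

# Individual 1 as described in the text: low A, medium B, high C, low D, low E.
individual_1 = {"A": "none", "B": "short", "C": "long", "D": "none", "E": "none"}

print(glyph_index(individual_1))                     # 3
print(glyph_index(dict.fromkeys("ABCDE", "long")))   # 10
```

A score of 3 out of a possible 10 agrees with the remark that glyph 1 lies about one-third of the way from the minimum to the maximum total.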

In working with these glyphs, it has gradually been realized that they are a 
device for helping the eye to aid the mind. For maximum efficiency it is best to 
make some concessions to the eye. The mind can be trained to adjust itself more 
readily than can the eye. For instance, the medium values were originally scored 
with a ray which was made exactly half as long as the ray for the high values. It 
was learned from experience that the eye could do a better job, particularly in 
problems involving large numbers of glyphs, if, in scanning the diagram, it never 
had to stop to decide whether a ray was medium or long. Therefore, it is better 
to have the long rays at the very least almost three times the length of the short
ones. For the same reason, in all problems involving much scanning, it is better 
to code the data so simply that the eye can be trained to take in each glyph 
almost instantaneously. Examples of this principle will be discussed in order— 
quartiles, number of rays, ray positions, accessory data, glyph variations. 

Quartiles—Since the analysis of variability by medians and quartiles is a well- 
established technique, it was originally attempted to set the rays for such 
analysis, coding them as follows: no ray, values below lower quartile; very 
short ray, values between the lower and medium quartile; medium-short ray, 
values between the medium and the upper quartile; long ray, values above the 
upper quartile. It was found that, with any considerable number of glyphs, 
even a trained eye could not analyze the interaction of variables with this system 
as well as with the simpler one outlined above (but see below under “Glyph
Variations”).

Number of rays—Though, theoretically, there might be a very large number of 
rays, the eye works most efficiently with no more than three to seven, depending 
on the eye. If the problem has many important variables, it is better to take the 
four or five which are most closely correlated and turn them into an index, as 
described above. This index can then be coded on a ray in position C or it may 
be used on the X- or Y- axis of a pictorialized scatter diagram. 

Ray positions—Originally, rays were used around all sides of each glyph. This 
confuses the eye. By slanting them all in approximately the same direction, it is 
possible to scan large complicated diagrams more efficiently. By using these 
standardized positions in problem after problem, there is considerable carry-over
from one problem to another. The data can be interpreted more efficiently
by an experienced analyzer if he is using the standard with which his eyes are
already familiar.

Accessory Data—Many biologists, when first using this method, are apt to put 
on various kinds of accessory data as little projections of one shape or another.
These all hinder the eye enough to reduce efficiency. If there are two kinds of
individuals which one wishes to distinguish, one kind may be given a white dot
and the other one a black dot. Other accessory information which one wishes to
try out in one way and another is best managed by tiny pencil-point dots
placed immediately to the left or right of the base of each glyph. They will
scarcely interfere with the scanning, and their association with the other
variables can be worked out by the eye.

Glyph Variations—In so far as possible, it is better to vary the glyph from 
problem to problem by adapting the coding to the problem as much as possible 
and to change the glyph system as little as possible. For instance, if quartile 
data need to be worked with, one can diagram three variables in quartiles— 
one variable at the left equatorial position, one at the right, and one at the 
upper pole. For each variable the four quarters of the data are diagramed as 
follows: no ray, one short ray, two short rays, two long rays. It has been found 
that to the eye the jump from no ray to one short ray is about as significant as 
that from two short rays to two long rays. 
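
The quartile coding described above can be illustrated with a short Python sketch. The measurements are hypothetical, and two details are assumptions that the text does not specify: the quartile method (the default of `statistics.quantiles`) and the treatment of a value exactly equal to a quartile, which is placed here in the higher quarter.

```python
from bisect import bisect_right
from statistics import quantiles

# Ray pattern for each quarter of the data, lowest to highest, as in the text.
QUARTER_RAYS = ["no ray", "one short ray", "two short rays", "two long rays"]

def quartile_rays(values):
    """Code every value by the quarter of the data in which it falls."""
    cuts = quantiles(values, n=4)  # lower quartile, median, upper quartile
    return [QUARTER_RAYS[bisect_right(cuts, v)] for v in values]

# Hypothetical measurements of one variable on eight individuals.
measurements = [1, 2, 3, 4, 5, 6, 7, 8]
for value, rays in zip(measurements, quartile_rays(measurements)):
    print(value, rays)
```

Each of the three variables diagrammed in quartiles would be coded in this way and drawn at its own position on the glyph.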

If one needs to diagram certain variables in more than three or four categories, 
he can use even more rays. A simple example is shown in Figure II, where two 
variables are coded in values from 1 to 10 each—one variable on rays slanting 
to the right, the other on rays slanting to the left. 

For problems in which the glyphs are merely employed in serial order, it is 
not so important to keep the rays down to two lengths. They may then be 
diagramed proportionate to the ranging of each variable or to the actual 
measured values. 

Figure II. Top: variables A and B are coded in values from 1 to 10 each. Lower left: the
records of five individuals with respect to A and B. Lower right: the corresponding
metroglyphs for each of the five individuals.

When going back and forth from glyphs used in serial order or on a map, to
glyphs used on a Cartesian grid, two rays may be replaced by qualities diagrammed
on the X- and Y-axes, or measures on the X- and Y-axes may be replaced by rays of
appropriate lengths. In Figure I at the lower left, individuals 1, 2, 3, and 4 are transferred
to a Cartesian grid. Qualities A and E (each measured on a scale of 1-10, using
the original values given in parentheses in the upper-left-hand corner of Fig.
I), are employed to locate each glyph on the scatter diagram. It will be seen
that these glyphs are the same as those in the figure to the right, except that the
ray at position A has been replaced by the position of the entire glyph on the
Y-axis, while the ray at position E has been replaced by the position of the
glyph on the X-axis. It will be noted that, as far as one can judge from four 
individuals, variables B and C are associated with each other, as are B and D; 
that B, C, and D, separately and in combination, are associated strongly with 
variable A and slightly with variable E. 

In attempting to work out complexes of related qualities, the analysis is 
facilitated if the ray lengths are coded in such a way that all the extreme values 
characteristic of one complex are assigned long rays and those characteristic 
of the other are assigned no rays. For example, in studying hybridization between
two subspecies of Campsis, one of the subspecies had a short tube, a wide limb,
and much red in the flower; the other had a long tube, a small limb, and little
red. Redness and limb width were coded with long rays for much red and for
wide limbs; tube length was coded in reverse, with a long ray for short tubes.
This meant that those hybrids closely resembling one parent appeared as a dot
with long rays and those closely resembling the other parent as a rayless dot.
EDITOR'S NOTE:


Dr. Anderson requested that the following note be appended: 

“The analytical method published here in condensed form is part of a general 
investigation in the wide field between natural history and mathematics. It is 
an attempt to use the wide observational basis of natural history to help pure 
and applied mathematics. Natural history is a limiting case insofar as raw 
scientific data is concerned and is a most likely source of useful hypothesis
finding techniques. No other technical papers have even approached this fundamental
aspect of the problem though a short book to be published by the University
of Michigan Press is almost ready for their editor. I have however
published a general and, I think, useful essay as part of the jubilee volume of 
the American Journal of Botany, Vol. 43, No. 10, 882-9, Dec. 1956.” 

In commenting on this latter article, Dr. Anderson went on to say, “Before
you read it remember that when I send it to most statisticians I put at the top
in quotations, ‘I come not to destroy, but to fulfill.’ The article is a little too
sharp in tone, but I believe after three years of discussion with first rate mathe- 
maticians that, though the tone is sharp, the points are well taken.” 
Inter-plant Storage in Continuous Manufacturing


H. D. MILLER


Statistical Laboratory, University of Cambridge 


A continuously operating plant A feeds its product into storage which in turn 
feeds a second continuously operating plant B. The loss of production due to empty 
or full storage depends on the storage capacity. This dependence is examined by 
treating the fluctuating storage content as a finite Markov chain, as in P. A. P. 
Moran’s theory of the Finite Dam. (Gani (1957).) 

The theory is applied to a chemical process where there are twin plants feeding 
into storage and twin plants being fed by the storage. 


1. INTRODUCTION 


The situation considered here is that found in a continuous manufacturing 
process where material, produced continuously in one or more plants, is held in 
storage and then passed on to another plant or plants for further processing. 
For simplicity in discussion a system will be considered consisting of one 
“upstream” plant, say plant A, which feeds its product continuously into 
storage and one “downstream” plant, say plant B into which material from the 
storage is continuously fed. The system is assumed to be self-contained in that 
all the material produced by plant A is used by plant B and plant B uses only 
material produced by plant A and none from any other source. 

The reason for having storage at all is that both plants are subject to shut- 
downs, and the storage allows one plant to continue production while the other 
plant is shut down. Production may be lost on plant B when the storage is 
empty and on plant A when the storage is full. The problem to be examined is 
how the capacity of the storage affects loss of production. The methods are 
based on P. A. P. Moran’s theory of the finite dam as described in a paper 
by Gani (1957). 

There are in general two types of shut-down on a plant. Firstly there are the 
scheduled or planned shut-downs for cleaning and maintenance which have 
known duration and occur on predetermined dates, and secondly there are 
unscheduled shut-downs which are unforeseen and occur for a variety of reasons, 
usually some technical breakdown on the plant. Shut-downs are not necessarily 
total, though they usually are for planned maintenance. When there is no 
scheduled shut-down in progress, production may be variable and during an 
unscheduled shut-down production may cease altogether or it may be reduced 
considerably. 

It will be assumed that scheduled shut-downs occur regularly for both plants 
A and B and that there is a time cycle for the system of length T days, say, in 
which the sequence of scheduled shut-downs is repeated. This means that the 
system is in the same state with regard to scheduled shut-downs on day t as it
is on day t + T. For example, plant A may have planned maintenance of 3 days
duration every 4 weeks and plant B may have 5 days every 6 weeks; in this
case the cycle is of length 12 weeks (T = 84 days), and during each 12 week
period the same sequence of scheduled shut-downs occurs at the same times.
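The cycle length in this example is the least common multiple of the two maintenance periods. A minimal Python sketch of that arithmetic follows; the function name `cycle_length_days` is illustrative and does not come from the paper.

```python
from functools import reduce
from math import gcd


def cycle_length_days(periods_in_weeks):
    """Length T, in days, of the joint cycle of several periodic schedules."""
    # Least common multiple of all periods, computed pairwise via gcd.
    lcm_weeks = reduce(lambda a, b: a * b // gcd(a, b), periods_in_weeks, 1)
    return 7 * lcm_weeks


# Plant A is maintained every 4 weeks, plant B every 6 weeks.
T = cycle_length_days([4, 6])
```

With the periods of 4 and 6 weeks quoted above this gives the 12-week cycle of the text.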


2. THEORY 


A discrete-time model will be used throughout and for convenience the unit
of time will be taken to be 1 day. The storage material will also be assumed to
flow in discrete units, and the storage capacity is an integral number N of these
units. $S_t$ denotes the number of units of material in storage at the end of
day t and is an integer in the range $0 \le S_t \le N$. $S_t$ will be called the storage
level.

Let $a_t$ ($a_t = 0, 1, 2, \ldots, a$) denote the number of units of material produced
by plant A on day t, and let $b_t$ ($b_t = 0, 1, 2, \ldots, b$) denote the number of units
used by plant B on day t; a is the maximum possible production of plant A and
b the maximum possible usage of plant B. $a_t$ and $b_t$ are discrete random variables
and they satisfy the relation

$$S_t = S_{t-1} + a_t - b_t \qquad (1)$$

$a_t$ and $b_t$ are not necessarily stochastically independent of each other nor of $S_{t-1}$
since, for example, if the storage is nearly empty at the end of day t − 1 it may
be necessary on day t to regulate deliberately the usage of plant B according
to the production of plant A, or to shut plant B down completely.

Further, the probability distributions of $a_t$ and $b_t$ depend on t since, for
example, the distribution of $a_t$ will be different on days when plant A is
undergoing scheduled maintenance from days when it is not. It is assumed that these
distributions are known; they will in general be based on plant data. From
these distributions it is possible to obtain the distribution of $S_t$ conditional on
$S_{t-1}$. In other words, the behaviour of the storage level from day to day is
governed by the production of plant A and the usage of plant B.

Let
$$p_t(i, j) = \Pr(S_t = j \mid S_{t-1} = i), \qquad (i, j = 0, 1, 2, \ldots, N).$$
Then
$$p_t(i, j) = \Pr(a_t - b_t = j - i \mid S_{t-1} = i) \qquad (2)$$

The $p_t(i, j)$ form an $(N + 1) \times (N + 1)$ stochastic matrix $P_t$ of transition
probabilities which govern the change in the storage level in the time interval
(t − 1, t), and (2) expresses the relation between these probabilities and the
production and usage distributions of plants A and B.

Thus $S_t$ is a finite Markov chain (for a description of the theory of finite
transition matrices and Markov chains see Feller (1957)). The transition matrix
$P_t$ is not constant with respect to t since it depends on whether or not the plants
are undergoing scheduled shut-downs. However, $P_t$ is periodic in t with period
T since scheduled shut-downs occur periodically with period T. Thus

$$P_t = P_{t+T}$$

Consider the storage level on the same day of successive cycles, i.e. $S_t$,
$S_{t+T}$, $S_{t+2T}$, .... This defines a Markov chain with constant transition matrix
given by the product

$$Q_t = P_{t+1} P_{t+2} \cdots P_T P_1 P_2 \cdots P_t \qquad (3)$$

The transition matrix of $S_{t+nT}$ is constant with respect to n, not t. Hence for
each day t of the cycle, the storage level $S_t$ will have a limiting probability
distribution denoted by the row vector

$$x_t = (x_{0t}, x_{1t}, x_{2t}, \ldots, x_{Nt}) \qquad (4)$$

where $x_t$ is given by

$$x_t = x_t Q_t, \qquad \sum_{i=0}^{N} x_{it} = 1 \qquad (5)$$

Thus eventually, after a large number of cycles the system settles down to a
steady state irrespective of the initial conditions and the steady state is defined
by the limiting probabilities of (5).
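The limiting vectors of a chain whose transition matrix repeats with period T can be approximated by running the chain through many full cycles and then applying one further cycle day by day. The sketch below does this in plain Python; the helper names and the two-state, two-day matrices are hypothetical illustrations, not plant data.

```python
def vec_mat(x, P):
    """Row vector x times matrix P."""
    return [sum(x[i] * P[i][j] for i in range(len(x))) for j in range(len(P[0]))]


def limiting_distributions(P_cycle, n_cycles=200):
    """Return [x_0, x_1, ..., x_T] for matrices P_cycle[t-1] = P_t of period T.

    x_0 is approximated by power iteration, i.e. by pushing an arbitrary
    starting distribution through n_cycles complete cycles.
    """
    size = len(P_cycle[0])
    x = [1.0 / size] * size              # arbitrary starting distribution
    for _ in range(n_cycles):
        for P in P_cycle:
            x = vec_mat(x, P)
    xs = [x]
    for P in P_cycle:                    # one more cycle, recording each day
        xs.append(vec_mat(xs[-1], P))
    return xs


# Hypothetical two-state chain with a two-day cycle.
P1 = [[0.9, 0.1], [0.5, 0.5]]
P2 = [[0.2, 0.8], [0.3, 0.7]]
xs = limiting_distributions([P1, P2])
```

In the steady state the last vector of the list coincides with the first, which is a convenient check that enough cycles were run.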
In computation, in order to obtain $x_t$ for t = 1, 2, 3, ... it is necessary to
obtain only $Q_0$ (and hence $x_0$) where

$$Q_0 = P_1 P_2 \cdots P_T$$

From the definition of $P_t$, $x_t$ satisfies the recursion

$$x_t = x_{t-1} P_t$$


Let $p(a_t \mid i)$ ($a_t = 0, 1, 2, \ldots, a$; $i = 0, 1, 2, \ldots, N$) be the probability
distribution of $a_t$ conditional on $i = S_{t-1}$, but unconditional on $b_t$, and let $q(b_t \mid i)$
($b_t = 0, 1, 2, \ldots, b$; $i = 0, 1, 2, \ldots, N$) be the probability distribution of $b_t$
conditional on i, but unconditional on $a_t$. In the limiting steady state, the expected
number of units produced on day t by plant A is given by

$$c_t = \sum_{a_t=0}^{a} \sum_{i=0}^{N} a_t \, x_{i,t-1} \, p(a_t \mid i).$$

Summed over the days of the cycle,

$$A_N = \sum_{t=1}^{T} c_t$$

is the limiting expected production per cycle of plant A. For plant B the
limiting expected usage on day t is given by

$$d_t = \sum_{b_t=0}^{b} \sum_{i=0}^{N} b_t \, x_{i,t-1} \, q(b_t \mid i)$$

and the limiting expected usage per cycle by

$$B_N = \sum_{t=1}^{T} d_t$$

Ultimately, the quantity which it is desired to calculate is the loss of
production in each plant due to storage. In order to determine this, consider a
storage of infinite capacity. Since, in this case, there is always material in store
and always storage space for more material the production of plant A and the
usage of plant B are not in any way determined by the storage level or by each
other. Thus the probability distributions $p(a_t \mid i)$ and $q(b_t \mid i)$ will both be
unconditional and can be written $p(a_t)$ and $q(b_t)$ respectively. The total expected
production and usage per cycle are given by

$$A_\infty = \sum_{t=1}^{T} \sum_{a_t=0}^{a} a_t \, p(a_t)$$

$$B_\infty = \sum_{t=1}^{T} \sum_{b_t=0}^{b} b_t \, q(b_t)$$

The average daily production and usage can be defined by $A_\infty/T$ and $B_\infty/T$
respectively.

The expected loss of production per cycle of plant A due to storage of capacity
N is given by

$$A_\infty - A_N$$

The loss of production measured on a time scale is called the outage and thus
for plant A the expected number of days outage due to storage is

$$L_N = (A_\infty - A_N)/(A_\infty/T)$$

This can be described as the loss of production time with the rate of production
taken as that with infinite storage. Similarly for plant B, the expected number
of days outage due to storage is

$$M_N = (B_\infty - B_N)/(B_\infty/T)$$

$L_N$ and $M_N$ may be calculated for different values of N and thus the effect
of storage capacity on loss of production may be determined.
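The outage formula simply converts a loss of material into days at the production rate attained with unlimited storage. A small Python sketch of the conversion follows; the function name and the numerical inputs are hypothetical and serve only to show the arithmetic.

```python
def expected_outage_days(expected_infinite, expected_finite, T):
    """Days of outage per cycle: lost units divided by the daily rate
    achieved with unlimited storage (A_inf / T or B_inf / T)."""
    daily_rate = expected_infinite / T
    return (expected_infinite - expected_finite) / daily_rate


# Hypothetical figures: 340 units per 364-day cycle with unlimited storage,
# 330 units per cycle with storage of capacity N.
L_N = expected_outage_days(expected_infinite=340.0, expected_finite=330.0, T=364)
```

The same function gives the outage of plant B when the expected usages are supplied in place of the expected productions.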

In principle, the extension of the theory to a system consisting of several 
parallel upstream plants, a single storage and several parallel downstream 
plants can be carried out simply since essentially the problem is to set up the 
stochastic transition matrix $P_t$, which is periodic in t. Several plants can be
regarded mathematically as one plant and the end result would be to calculate 
the total loss of production due to finite storage. Unless the plants are of identical 
size, a difficulty might arise in distributing the loss of production among the 
plants. In the numerical application below, there are two identical upstream 
plants and two identical downstream plants. 

In any application numerical computation will be heavy and will, in general, 
require the use of an electronic computer. The main part of the computation is 
to obtain the product matrix $Q_0$, since the matrices $P_t$ may be large.


3. APPLICATION TO CONTINUOUS CHEMICAL PROCESS 


In a particular continuous chemical process, the system consisted originally
of one upstream plant $A_1$, the storage containers and one downstream plant $B_1$.
It was proposed to build a twin upstream plant $A_2$ and a twin downstream plant
$B_2$, and the problem arose of how much extra storage capacity to provide.

The theory will be applied to the self-contained system consisting of two
identical upstream plants $A_1$ and $A_2$ which both feed into the storage, which
in turn feeds two identical downstream plants $B_1$ and $B_2$. Further assumptions
are as follows:

(a) The unit on the discrete time axis is 1 day.

(b) The unit of stored material is one full day's production of an A-plant
and this is equivalent to one full day's usage of a B-plant.

(c) Unscheduled shut-downs occur randomly and last for exactly 1 day at a
time. For an A-plant, the probability of an unscheduled shut-down on any
day is 0.06 and for a B-plant 0.04. These figures are derived from plant
data.

(d) The year is made up of 52 weeks and this is taken to be the length of the
cycle for scheduled shut-downs, i.e. T = 364 days. The scheduled shut-downs,
which are total shut-downs, are arranged as follows:

Plants $A_1$ and $A_2$: Each plant has a scheduled shut-down of 7 days
duration. They occur alternately for plants $A_1$ and $A_2$ during weeks 17,
34, and 51 of each year, i.e. approximately 180° out of phase. Each plant
has two shut-downs one year and one the next. This does not matter from
the point of view of storage since the plants are identical and it is sufficient
to know that during weeks 17, 34 and 51 of each year one or the other of
the A-plants shuts down.

Plants $B_1$ and $B_2$: Each plant has a scheduled shut-down of 4 days
duration every 8 weeks and the times at which they occur for plant $B_1$
are 180° out of phase with those of $B_2$. Thus plant $B_1$ shuts down on the
last 4 days of week 8, week 16, week 24, etc., and plant $B_2$ shuts down
on the last 4 days of week 4, week 12, week 20, etc. With this arrangement
plant $B_1$, in the 1st year, will have 6 shut-downs and plant $B_2$ will have 7
and the reverse in the 2nd year. However, from the point of view of the
storage this is immaterial since plants $B_1$ and $B_2$ are identical; it is
sufficient to know that one or other plant shuts down on the last 4 days of
every 4th week.

(e) The storage capacity N is a whole number of days production. In the
numerical calculation, N was given the values 9, 11 and 13.


The 364 days of the cycle can be divided into non-overlapping sets as follows:

(i) $I_1$ is the set of values of t such that plant $A_1$ or plant $A_2$ is undergoing a
scheduled shut-down on day t, i.e. $I_1$ contains the values 113, 114, ...,
119; 232, 233, ..., 238; 351, 352, ..., 357.

(ii) $I_2$ is the set of values of t such that plant $B_1$ or plant $B_2$ is undergoing a
scheduled shut-down on day t, i.e. $I_2$ contains the values 25, 26, 27, 28;
53, 54, 55, 56; ...; 361, 362, 363, 364.

(iii) $I_0$ contains the remaining values of t and on the corresponding days no
plants are undergoing scheduled shut-downs.

In order to construct the transition matrices $P_t$ let

$\alpha_1$ = Pr (plant $A_1$ shuts down on day t)
$\beta_1$ = Pr (plant $B_1$ shuts down on day t)

and let $\alpha_2$ and $\beta_2$ be similarly defined for plants $A_2$ and $B_2$ respectively.
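The three day sets follow mechanically from the shut-down calendar, so they are easy to build and check in code. A short Python sketch is given below; the variable names `I0`, `I1` and `I2` mirror the sets in the text, and the construction assumes weeks are numbered from 1 with week w covering days 7(w − 1) + 1 to 7w.

```python
T = 364

# A-plant shut-downs: the whole of weeks 17, 34 and 51.
I1 = {day
      for week in (17, 34, 51)
      for day in range(7 * (week - 1) + 1, 7 * week + 1)}

# B-plant shut-downs: the last 4 days of every 4th week (days 25-28, 53-56, ...).
I2 = {day
      for k in range(1, T // 28 + 1)
      for day in range(28 * k - 3, 28 * k + 1)}

# All remaining days: no scheduled shut-down in progress.
I0 = set(range(1, T + 1)) - I1 - I2
```

Counting the elements gives 21 days in `I1`, 52 days in `I2` and 291 days in `I0`, and the first two sets have no day in common.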
With the exception of boundary conditions, the storage level can undergo
five possible changes on any day:

(i) The storage level rises by 2 units if both B-plants are shut down and both
A-plants are operating. The probability of this event is

$$(1 - \alpha_1)(1 - \alpha_2)\beta_1\beta_2$$

(ii) The storage level rises by one unit if both A-plants are operating and
only one B-plant is closed, or if only one A-plant is operating and both
B-plants are closed. The probability of this event is

$$(1 - \alpha_1)(1 - \alpha_2)\{\beta_1(1 - \beta_2) + \beta_2(1 - \beta_1)\} + \beta_1\beta_2\{\alpha_1(1 - \alpha_2) + \alpha_2(1 - \alpha_1)\}$$

(iii) By reasoning similar to (i) and (ii) above, the probability of no change in
storage level is

$$\alpha_1\alpha_2\beta_1\beta_2 + (1 - \alpha_1)(1 - \alpha_2)(1 - \beta_1)(1 - \beta_2) + \{\alpha_1(1 - \alpha_2) + \alpha_2(1 - \alpha_1)\}\{\beta_1(1 - \beta_2) + \beta_2(1 - \beta_1)\}$$

(iv) The probability of a fall of 1 unit in the storage level is

$$\alpha_1\alpha_2\{\beta_1(1 - \beta_2) + \beta_2(1 - \beta_1)\} + (1 - \beta_1)(1 - \beta_2)\{\alpha_1(1 - \alpha_2) + \alpha_2(1 - \alpha_1)\}$$

(v) The probability of a fall of 2 units in the storage level is

$$\alpha_1\alpha_2(1 - \beta_1)(1 - \beta_2)$$

Thus for $2 \le i \le N - 2$ we have the following transition probabilities:

$$
\begin{aligned}
p_t(i, i+2) &= (1 - \alpha_1)(1 - \alpha_2)\beta_1\beta_2 \\
p_t(i, i+1) &= (1 - \alpha_1)(1 - \alpha_2)\{\beta_1(1 - \beta_2) + \beta_2(1 - \beta_1)\} + \beta_1\beta_2\{\alpha_1(1 - \alpha_2) + \alpha_2(1 - \alpha_1)\} \\
p_t(i, i) &= \alpha_1\alpha_2\beta_1\beta_2 + (1 - \alpha_1)(1 - \alpha_2)(1 - \beta_1)(1 - \beta_2) \\
&\quad + \{\alpha_1(1 - \alpha_2) + \alpha_2(1 - \alpha_1)\}\{\beta_1(1 - \beta_2) + \beta_2(1 - \beta_1)\} \\
p_t(i, i-1) &= \alpha_1\alpha_2\{\beta_1(1 - \beta_2) + \beta_2(1 - \beta_1)\} + (1 - \beta_1)(1 - \beta_2)\{\alpha_1(1 - \alpha_2) + \alpha_2(1 - \alpha_1)\} \\
p_t(i, i-2) &= \alpha_1\alpha_2(1 - \beta_1)(1 - \beta_2) \\
p_t(i, j) &= 0 \quad \text{otherwise}
\end{aligned}
\qquad (6)
$$
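The five probabilities in (6) depend only on how many A-plants and how many B-plants are running, which makes them straightforward to code. The following Python sketch is one way to write them; the function name `change_probs` and the dictionary layout are choices made here for illustration.

```python
def change_probs(a1, a2, b1, b2):
    """Probabilities of an unconstrained storage change of +2, +1, 0, -1, -2.

    a1, a2 are the shut-down probabilities of the two A-plants and
    b1, b2 those of the two B-plants, as in (6).
    """
    a_both_up = (1 - a1) * (1 - a2)
    a_one_up = a1 * (1 - a2) + a2 * (1 - a1)
    a_both_down = a1 * a2
    b_both_up = (1 - b1) * (1 - b2)
    b_one_up = b1 * (1 - b2) + b2 * (1 - b1)
    b_both_down = b1 * b2
    return {
        2: a_both_up * b_both_down,
        1: a_both_up * b_one_up + a_one_up * b_both_down,
        0: (a_both_down * b_both_down + a_both_up * b_both_up
            + a_one_up * b_one_up),
        -1: a_both_down * b_one_up + a_one_up * b_both_up,
        -2: a_both_down * b_both_up,
    }


# Days without scheduled shut-downs: unscheduled probabilities 0.06 and 0.04.
p = change_probs(0.06, 0.06, 0.04, 0.04)
```

The five values always sum to one, since the cases exhaust every combination of plants running and shut down.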


The values of $p_t(i, j)$ for i = 0, 1, N − 1, N are determined by the rules
governing the operation of the plants when the storage level is very high or very
low. In this particular problem it is assumed that it is possible for the A-plants
to feed the B-plants directly, if necessary; thus if $S_{t-1}$ = 0 or 1, no B-plant is
shut down on day t for lack of material unless the usage required exceeds the
sum of the material in storage and the production of the A-plants.

$$
\begin{aligned}
p_t(1, 0) &= p_t(i, i-1) + p_t(i, i-2) \\
p_t(0, 0) &= p_t(i, i) + p_t(i, i-1) + p_t(i, i-2)
\end{aligned}
\qquad (7)
$$

where the terms on the r.h.s. are defined in (6). For other values of j, $p_t(0, j)$
and $p_t(1, j)$ will be obtained from (6). By similar considerations at the other
boundary

$$
\begin{aligned}
p_t(N-1, N) &= p_t(i, i+1) + p_t(i, i+2) \\
p_t(N, N) &= p_t(i, i) + p_t(i, i+1) + p_t(i, i+2)
\end{aligned}
\qquad (8)
$$

where the terms on the r.h.s. are defined in (6). For other values of j, $p_t(N-1, j)$
and $p_t(N, j)$ will be obtained from (6).

The transition matrices $P_t$ are now completely defined by (6), (7) and (8).
$P_t$ will be constant within each of the sets $I_0$, $I_1$ and $I_2$. Substituting the actual
values of $\alpha_1$, $\alpha_2$, $\beta_1$, $\beta_2$ (omitting the subscript in $p_t(i, j)$) we have the
following transition probabilities:

(i) For t in the set $I_0$, i.e. when no scheduled shut-downs are in progress,

$\alpha_1 = \alpha_2 = 0.06$, $\beta_1 = \beta_2 = 0.04$

and

p(0, 0) = 0.9305
p(1, 0) = 0.1075
p(N − 1, N) = 0.0695
p(N, N) = 0.8925

For other values of i, j

p(i, i + 2) = 0.0014
p(i, i + 1) = 0.0681
p(i, i) = 0.8230
p(i, i − 1) = 0.1042
p(i, i − 2) = 0.0033
p(i, j) = 0 otherwise

Let $P_t = E$ in this case.


(ii) For t in the set $I_1$, i.e. when one of the A-plants is undergoing a scheduled
shut-down,

$\alpha_1 = 1$, $\alpha_2 = 0.06$ (or $\alpha_1 = 0.06$, $\alpha_2 = 1$), $\beta_1 = \beta_2 = 0.04$

and

p(0, 0) = 0.9985
p(1, 0) = 0.9262
p(N, N) = 0.0738

For other values of i, j

p(i, i + 1) = 0.0015
p(i, i) = 0.0723
p(i, i − 1) = 0.8709
p(i, i − 2) = 0.0553
p(i, j) = 0 otherwise

Let $P_t = F$ in this case.

(iii) For t in the set $I_2$, i.e. when one of the B-plants is undergoing a scheduled
shut-down,

$\alpha_1 = \alpha_2 = 0.06$, $\beta_1 = 1$, $\beta_2 = 0.04$ (or $\beta_1 = 0.04$, $\beta_2 = 1$)

and

p(0, 0) = 0.1119
p(N − 1, N) = 0.8881
p(N, N) = 0.9965

For other values of i, j

p(i, i + 2) = 0.0353
p(i, i + 1) = 0.8528
p(i, i) = 0.1084
p(i, i − 1) = 0.0035
p(i, j) = 0 otherwise

Let $P_t = G$ in this case.
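The boundary rules (7) and (8) amount to folding any move that would leave the range 0 to N back onto the nearest boundary. The Python sketch below builds the three matrices that way from the interior probabilities tabulated above; the function name `storage_matrix` and the choice N = 9 are illustrative assumptions.

```python
def storage_matrix(N, change):
    """(N+1) x (N+1) transition matrix from unconstrained change probabilities.

    change maps each step d in {-2, ..., 2} to its probability.  Steps that
    would take the level outside [0, N] are accumulated on the boundary
    state, which is how rules (7) and (8) treat levels 0, 1, N-1 and N.
    """
    P = [[0.0] * (N + 1) for _ in range(N + 1)]
    for i in range(N + 1):
        for d, prob in change.items():
            j = min(max(i + d, 0), N)
            P[i][j] += prob
    return P


# Interior probabilities as tabulated (rounded to four decimals).
E_CHANGE = {2: 0.0014, 1: 0.0681, 0: 0.8230, -1: 0.1042, -2: 0.0033}
F_CHANGE = {2: 0.0, 1: 0.0015, 0: 0.0723, -1: 0.8709, -2: 0.0553}
G_CHANGE = {2: 0.0353, 1: 0.8528, 0: 0.1084, -1: 0.0035, -2: 0.0}

N = 9
E, F, G = (storage_matrix(N, c) for c in (E_CHANGE, F_CHANGE, G_CHANGE))
```

Each row of `E`, `F` and `G` sums to one, and the corner entries can be compared directly with the boundary values listed for the three cases.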

A different method of calculating the loss of production due to storage was
used in the actual computations from that outlined in the theory. It is necessary
to find the probability of a plant shutting down due to storage. Suppose for
example that $S_{t-1} = 1$.

If both A-plants are shut down on day t, and if both B-plants would otherwise
be operating, then one B-plant will have to shut down due to lack of material.
The probability of this event is $p_t(i, i-2)$ where this term is defined in (6). By
similar considerations the remaining probabilities of shut-downs due to storage
may be found. Thus we have the following results where the $p_t(i, j)$ are defined
in (6).

$$
\begin{aligned}
\Pr(\text{one A-plant shuts down due to storage} \mid S_{t-1} = N) &= p_t(i, i+1) \\
\Pr(\text{both A-plants shut down due to storage} \mid S_{t-1} = N) &= p_t(i, i+2) \\
\Pr(\text{one A-plant shuts down due to storage} \mid S_{t-1} = N-1) &= p_t(i, i+2) \\
\Pr(\text{one B-plant shuts down due to storage} \mid S_{t-1} = 0) &= p_t(i, i-1) \\
\Pr(\text{both B-plants shut down due to storage} \mid S_{t-1} = 0) &= p_t(i, i-2) \\
\Pr(\text{one B-plant shuts down due to storage} \mid S_{t-1} = 1) &= p_t(i, i-2)
\end{aligned}
\qquad (9)
$$

In all other cases the probability of shut-down due to storage is zero.
Now, using these probabilities, the expected outage of the A-plants due to
storage is given by

$$\sum_{t=1}^{364} \{x_{N-1,t-1}\, p_t(i, i+2) + x_{N,t-1}\, p_t(i, i+1) + 2x_{N,t-1}\, p_t(i, i+2)\} \qquad (10)$$

and that for the B-plants by

$$\sum_{t=1}^{364} \{x_{1,t-1}\, p_t(i, i-2) + x_{0,t-1}\, p_t(i, i-1) + 2x_{0,t-1}\, p_t(i, i-2)\} \qquad (11)$$
TABLE 1
Average Annual Outage Per Plant Due to Storage (Days) 


N = Storage Capacity (Days) 


9 11 13 


A-plant 9.2 
0.8 


5 
+ 


8 2 
0. 1 


8 
B-plant 0. 
where $x_t = (x_{0t}, x_{1t}, \ldots, x_{Nt})$ is the vector of limiting probabilities for the
storage level $S_t$ and $x_0 = x_{364}$.

In order to obtain $x_t$ it is necessary to compute the product matrix $Q_t$ defined
in (3). In this case $Q_0$ (= $Q_{364}$) is given by

$$Q_0 = P_1 P_2 \cdots P_{364} = (E^{24}G^{4})^{4}\, F^{7}E^{17}G^{4}\, (E^{24}G^{4})^{3}\, E^{7}F^{7}E^{10}G^{4}\, (E^{24}G^{4})^{3}\, E^{14}F^{7}E^{3}G^{4}$$

The vector $x_0$ is given by $x_0 = x_0 Q_0$ or by any row of the limit

$$\lim_{n \to \infty} Q_0^{n}$$

and for computational purposes (on an electronic computer) $x_0$ was evaluated
by taking the first row of a high power of $Q_0$, which for practical purposes is the
same as the limit. For t > 0, $x_t$ is given by the recurrence relation

$$x_t = x_{t-1} P_t$$

The values $x_{0t}$, $x_{1t}$, $x_{N-1,t}$ and $x_{Nt}$ were obtained from the computer and by
substituting these in (10) and (11) the required quantities were obtained.

The numerical values of the expressions (10) and (11) are tabulated in Table
1. They are divided by 2 to give the average annual outage per plant due to
storage.
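The whole calculation can be sketched in a few dozen lines of plain Python. The version below is an illustration under stated assumptions rather than the program actually used: it works with unrounded transition probabilities, approximates the steady state by iterating the chain over 60 yearly cycles from a uniform start, and evaluates (10) and (11) by counting, for each state and each unconstrained change, how many plants must stop. All function names are hypothetical.

```python
def change_probs(a1, a2, b1, b2):
    """Probabilities of an unconstrained change of +2..-2 units, as in (6)."""
    a_up2, a_up1, a_up0 = (1 - a1) * (1 - a2), a1 * (1 - a2) + a2 * (1 - a1), a1 * a2
    b_up2, b_up1, b_up0 = (1 - b1) * (1 - b2), b1 * (1 - b2) + b2 * (1 - b1), b1 * b2
    return {2: a_up2 * b_up0,
            1: a_up2 * b_up1 + a_up1 * b_up0,
            0: a_up0 * b_up0 + a_up2 * b_up2 + a_up1 * b_up1,
            -1: a_up0 * b_up1 + a_up1 * b_up2,
            -2: a_up0 * b_up2}


def day_type(t):
    """'F': an A-plant in scheduled shut-down, 'G': a B-plant, 'E': neither."""
    if (t - 1) // 7 + 1 in (17, 34, 51):
        return "F"
    if (t - 1) % 28 >= 24:               # last 4 days of every 4th week
        return "G"
    return "E"


def annual_outage_per_plant(N, n_cycles=60):
    """Approximate (A-plant, B-plant) outage in days per plant per year."""
    probs = {"E": change_probs(0.06, 0.06, 0.04, 0.04),
             "F": change_probs(1.0, 0.06, 0.04, 0.04),
             "G": change_probs(0.06, 0.06, 1.0, 0.04)}
    x = [1.0 / (N + 1)] * (N + 1)        # arbitrary starting distribution
    for _ in range(n_cycles):
        out_a = out_b = 0.0              # totals for the current cycle only
        for t in range(1, 365):
            y = [0.0] * (N + 1)
            for i, xi in enumerate(x):
                for d, pd in probs[day_type(t)].items():
                    target = i + d
                    out_a += xi * pd * max(target - N, 0)   # A-plants stopped
                    out_b += xi * pd * max(-target, 0)      # B-plants stopped
                    y[min(max(target, 0), N)] += xi * pd
            x = y
    return out_a / 2, out_b / 2


outage_a, outage_b = annual_outage_per_plant(9)
```

Calling `annual_outage_per_plant` with N = 9, 11 and 13 yields quantities of the kind tabulated in Table 1; exact agreement with the published figures is not guaranteed, since the rounding and the number of cycles used here are assumptions.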

A salient feature of the results is the disparity between the A-outage (due to 
full storage) and the B-outage (due to empty storage). This is due to the dis- 
parity in the original data. The B-plants have more combined scheduled and 
unscheduled shut-downs than the A-plants and thus the storage tends to be more 
full than empty. The A-plants are producing more than the B-plants are con- 
suming and hence require to be shut down more often on account of storage in 
order to redress the balance. 


REFERENCES 


FELLER, W. (1957). “An Introduction to Probability Theory and its Applications,” Vol. 1,
2nd Edition, New York: Wiley, pp. 338-396.
Estimation of the Parameters of Two Parameter
Exponential Distributions from 


Censored Samples 


BENJAMIN EPSTEIN 
Wayne State University and Stanford University 


In life testing a two parameter exponential distribution is often required to 
provide an adequate representation of the observed results. Sufficient statistics and 
interval estimates for the parameters are given for the case of censored data. Numeri- 
cal examples are provided. 


It has been found in many problems of life testing that there are occasions 
when a two parameter exponential distribution is more appropriate for fitting 
life test data than is a one parameter distribution. By a two parameter
exponential distribution we mean a density function $f(x; \theta, A)$ such that

(1) $f(x; \theta, A) = \frac{1}{\theta} e^{-(x - A)/\theta}, \qquad x \ge A \ge 0, \quad \theta > 0.$


A can be thought of as a guarantee period within which no failures can occur or 
as a minimum life. If A = 0, equation (1) reduces to the one parameter ex- 
ponential. 

Problem: A sample of n items is drawn at random from a population whose
p.d.f. is described by (1). The experiment is terminated as soon as the first r
failure times $x_1 \le x_2 \le \cdots \le x_r$ become available. Items which fail are not
replaced. Give “best” estimates for the unknown parameters A and $\theta$.

Solution: It can be shown that $x_1$, the time to observe the first failure, and
$T(x_r - x_1)$, the total life observed in the interval $(x_1, x_r)$, are mutually
independent and jointly sufficient for estimating A and $\theta$. Sufficiency means roughly that
$x_1$ and $T(x_r - x_1)$ jointly contain all of the relevant information for estimating
A and $\theta$ that can be obtained from the first r failure times, $x_1 \le x_2 \le \cdots \le x_r$.


Best estimates for A and $\theta$ in the sense that they are unbiased and minimum
variance are given by

(2) $\hat{A} = x_1 - \hat{\theta}/n$

and

(3) $\hat{\theta} = T(x_r - x_1)/(r - 1),$

where

(4) $T(x_r - x_1) = (n - 1)(x_2 - x_1) + (n - 2)(x_3 - x_2) + \cdots + (n - r + 1)(x_r - x_{r-1})$
$\qquad\qquad = -(n - 1)x_1 + x_2 + x_3 + \cdots + x_{r-1} + (n - r + 1)x_r.$

It is often convenient in (3) and (4) to use the fact that

$T(x_r - x_1) = T(x_r) - T(x_1) = T(x_r) - n x_1,$

where

$T(x_r) = \sum_{i=1}^{r} x_i + (n - r)x_r.$
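The estimators (2) and (3) need only the ordered failure times and the sample size. The Python sketch below computes them and the total life (4); the function names and the three failure times used in the example are hypothetical.

```python
def total_life_after_first(failures, n):
    """T(x_r - x_1): item-time accumulated between the 1st and r-th failure,
    for n items on test and no replacement of failed items."""
    x = sorted(failures)
    # Between failures k and k+1 (1-based) there are n - k survivors.
    return sum((n - k) * (x[k] - x[k - 1]) for k in range(1, len(x)))


def estimates(failures, n):
    """Return (A_hat, theta_hat) from the first r failure times of n items."""
    x = sorted(failures)
    r = len(x)
    theta_hat = total_life_after_first(x, n) / (r - 1)
    a_hat = x[0] - theta_hat / n
    return a_hat, theta_hat


# Hypothetical test: 5 items, stopped at the 3rd failure.
times = [10.0, 12.0, 15.0]
a_hat, theta_hat = estimates(times, n=5)
```

The same total life is obtained from the shortcut T(x_r) − n x_1, which can serve as a check on hand calculations.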


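Confidence limits for $\theta$ are easy to obtain from the fact that $2(r - 1)\hat{\theta}/\theta =
2T(x_r - x_1)/\theta$ is distributed as $\chi^2(2r - 2)$. Thus for $r \ge 2$, one and two-sided
100(1 − $\alpha$) percent confidence intervals for $\theta$ are given respectively by

(5) $\left( \dfrac{2(r - 1)\hat{\theta}}{\chi^2_{\alpha}(2r - 2)},\ \infty \right) = \left( \dfrac{2T(x_r - x_1)}{\chi^2_{\alpha}(2r - 2)},\ \infty \right)$

(6) $\left( \dfrac{2(r - 1)\hat{\theta}}{\chi^2_{\alpha/2}(2r - 2)},\ \dfrac{2(r - 1)\hat{\theta}}{\chi^2_{1-\alpha/2}(2r - 2)} \right) = \left( \dfrac{2T(x_r - x_1)}{\chi^2_{\alpha/2}(2r - 2)},\ \dfrac{2T(x_r - x_1)}{\chi^2_{1-\alpha/2}(2r - 2)} \right),$

where $\chi^2_{p}(2r - 2)$ denotes the upper 100p percent point of the $\chi^2(2r - 2)$ distribution.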


To find confidence intervals on A we use the fact that $h = 2n(x_1 - A)/\theta$ and
$v = 2(r - 1)\hat{\theta}/\theta$ are independent and distributed respectively as $\chi^2(2)$ and
$\chi^2(2r - 2)$. Therefore the ratio $W = (2r - 2)h/2v$ is distributed according to the
F distribution with 2 degrees of freedom in the numerator and (2r − 2) degrees
of freedom in the denominator (denoted as F(2, 2r − 2)). From the F table one
can find for selected values of $\gamma$ and r the constants $w_\gamma$ such that
Prob ($0 < W < w_\gamma$) = $\gamma$ (i.e., $w_\gamma$ is the upper 100(1 − $\gamma$) percent point of the
F(2, 2r − 2) distribution). From the definition of W a 100$\gamma$ percent confidence
interval for A is given by

(7) $\left( x_1 - \dfrac{w_\gamma \hat{\theta}}{n},\ x_1 \right) = \left( x_1 - \dfrac{w_\gamma\, T(x_r - x_1)}{n(r - 1)},\ x_1 \right).$

It has been shown in [1] that this is the shortest 100$\gamma$ percent confidence interval
in the class of intervals being used.

Remark 1: Tables of the F distribution are useful in finding the confidence
interval (7). However, it may happen that for the particular values of r and $\gamma$
in question, the value $w_\gamma$ is not tabulated. In this case, a result proved in [2]
is useful. This result is that $Z = W/(r - 1)$ has the density function
$(r - 1)/(z + 1)^{r}$, $z > 0$. From this it follows that

(8) $\Pr(0 < Z < z_\gamma) = \gamma$

is satisfied by

(9) $z_\gamma = \left( \dfrac{1}{1 - \gamma} \right)^{1/(r-1)} - 1.$

But since

$Z = \dfrac{n(x_1 - A)}{(r - 1)\hat{\theta}}$

it follows that the desired 100$\gamma$ percent confidence interval for A is given by

(10) $\left( x_1 - \dfrac{z_\gamma (r - 1)\hat{\theta}}{n},\ x_1 \right) = \left( x_1 - \dfrac{z_\gamma\, T(x_r - x_1)}{n},\ x_1 \right).$

Since $z_\gamma = w_\gamma/(r - 1)$ we have, of course, the same confidence interval as before.
However, $z_\gamma$ is computable for any r and any $\gamma$.
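Because (9) is a closed form, the constant and the interval (10) can be evaluated without any F table. A minimal Python sketch follows; the function names are illustrative and the data in the usage lines are hypothetical.

```python
def z_gamma(gamma, r):
    """Constant z solving Pr(0 < Z < z) = gamma, from formula (9)."""
    return (1.0 / (1.0 - gamma)) ** (1.0 / (r - 1)) - 1.0


def guarantee_interval(x1, total_life, n, r, gamma=0.95):
    """100*gamma percent interval (10) for A.

    x1 is the first failure time and total_life is T(x_r - x_1).
    """
    lower = x1 - z_gamma(gamma, r) * total_life / n
    return lower, x1


# Equivalent tabulated constant w = (r - 1) * z for r = 10, gamma = 0.95.
w = (10 - 1) * z_gamma(0.95, 10)

# Hypothetical data: n = 10 items, r = 4 failures, 90 percent interval.
lower, upper = guarantee_interval(x1=100.0, total_life=450.0, n=10, r=4, gamma=0.90)
```

For r = 10 and a 95 percent interval, `w` agrees to two decimals with the tabulated upper 5 percent point of F(2, 18), namely 3.55.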


Remark 2:

$Z = \dfrac{n(x_1 - A)}{(r - 1)\hat{\theta}} = \dfrac{n(x_1 - A)}{T(x_r - x_1)}$

can be interpreted as the ratio between the total life between time A and $x_1$,
the time when the first failure occurs, and the total life between $x_1$ and $x_r$
inclusive. Clearly one wants to reject the hypothesis that A = 0, if $n x_1/T(x_r - x_1)$
is too large. It should be noted that under the hypothesis that A = 0,
$n x_1/[T(x_r - x_1)/(r - 1)] = n x_1/\hat{\theta}$ is distributed as F(2, 2r − 2).

Remark 3: Either formula (7) or its equivalent, formula (10), can be used to
test whether or not A differs significantly from zero. If $x_1 - w_\gamma \hat{\theta}/n$ or equivalently
$x_1 - z_\gamma T(x_r - x_1)/n$ is > 0, then A is significantly greater than zero at the
(1 − $\gamma$) level.

Remark 4: The 100$\gamma$ percent confidence interval for A can be interpreted as a
one-sided tolerance interval. More precisely we can make the statement that
all items live longer than $x_1 - z_\gamma \hat{\theta}(r - 1)/n$ (or $x_1 - z_\gamma T(x_r - x_1)/n$) with
confidence $\gamma$. 100$\gamma$ percent of these assertions will be correct.


NUMERICAL EXAMPLE 


1. 20 items are placed on test. Testing is terminated after one has observed 
the first 10 failures. Suppose that the first failure occurs 520 hours after the 
experiment starts. The total life observed between the time when the first 
failure occurs and the time when the tenth failure occurs is 12000 item hours. 
Assuming that the underlying distribution is exponential, do the following: 


(i) Test whether A > 0 at the .05 level. 


(ii) If A > 0, find the shortest 95% confidence interval for A and an un- 
biased estimate for A. 


(iii) Find an unbiased estimate for 6 and one and two-sided confidence 
intervals for @. 


Solution: (i) Suppose that A = 0, then $n x_1/[T(x_{10} - x_1)/9]$ is distributed as
F(2, 18). From the data

$$\frac{n x_1}{T(x_{10} - x_1)/9} = \frac{(20)(520)}{12000/9} = 7.80.$$

But the upper 5% point for F(2, 18) is 3.55. Hence A is significantly different
from zero at the .05 level. As a matter of fact, since the upper .5% point for
F(2, 18) is 7.21 and the upper .1% point for F(2, 18) is 10.39, A is significantly
different from zero at between the .001 and .005 levels.

(ii) From the data $\hat{\theta} = T(x_{10} - x_1)/9 = 12000/9 = 1333$. Hence an unbiased
estimate for A is given by $x_1 - \hat{\theta}/n$ = 520 − 1333/20 = 520 − 66.7 = 453.3.

The shortest 95% confidence interval can be computed from (7). Since the upper
5% point of F(2, 18) is 3.55 in this case, the interval is

$$\left( 520 - \frac{(3.55)(12000)}{(9)(20)},\ 520 \right) = (283, 520).$$

(iii) In (ii) we saw that the best estimate for $\theta$ is given by $\hat{\theta} = 1333$. From
(5) and (6), best one and two-sided 95% confidence intervals for $\theta$ are given by

$$\left( \frac{24000}{\chi^2_{.05}(18)},\ \infty \right) = \left( \frac{24000}{28.87},\ \infty \right) = (831, \infty)$$

and

$$\left( \frac{24000}{\chi^2_{.025}(18)},\ \frac{24000}{\chi^2_{.975}(18)} \right) = \left( \frac{24000}{31.53},\ \frac{24000}{7.906} \right) = (761, 3036)$$

respectively.

Remark: The tolerance interval in (ii) can also be interpreted as follows: We
are 95% confident of the assertion that all items survive 283 hours.
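The test in part (i) can also be carried out without tables, because the upper tail of F(2, 2r − 2) has the closed form $(1 + w/(r-1))^{-(r-1)}$, which follows from the density of Z quoted in Remark 1. The Python sketch below applies it to the data of this example; the function name `f2_upper_tail` is illustrative.

```python
def f2_upper_tail(w, r):
    """P(W > w) for W distributed as F(2, 2r - 2), in closed form."""
    m = r - 1
    return (1.0 + w / m) ** (-m)


# Data of the numerical example.
n, r = 20, 10
x1 = 520.0              # hours to the first failure
total_life = 12000.0    # item-hours between the 1st and 10th failures

theta_hat = total_life / (r - 1)         # unbiased estimate of theta
statistic = n * x1 / theta_hat           # F(2, 18) under the hypothesis A = 0
p_value = f2_upper_tail(statistic, r)    # significance level attained
a_hat = x1 - theta_hat / n               # unbiased estimate of A
```

The attained significance level computed this way falls between .001 and .005, in line with the bracketing obtained from the tabulated percentage points.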


REFERENCES 
1. B. EPSTEIN AND M. SOBEL, “Some Theorems Relevant to Life Testing from an Exponential
Distribution,” Annals of Math. Stat. 25, 373-381, 1954.


2. B. EPSTEIN, “Simple Estimators of the Parameters of Exponential Distributions when
Samples are Censored,” Annals of Math. Stat. 8, 15-25, 1956.
PROGRAM OF THE
120TH ANNUAL MEETING OF THE 
AMERICAN STATISTICAL ASSOCIATION 


SECTION ON PHYSICAL AND ENGINEERING SCIENCES 


Tuesday, August 23 (cont.)—4:00 PM-5:50 PM 
Spectral Analysis of Time Series 


Chairman: LUCIEN M. LECAM, University of California, Berkeley


Papers: “Mathematical Considerations in the Estimation of Spectra” by E.
Parzen, Stanford University
“General Considerations in the Estimation of Spectra” by G. M.
Jenkins, Stanford University


Discussion: (To be announced) 
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Industrial Applications 
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ance of Screened Lots” by Max Woods, Stanford University 
“A Problem in Plumbing” by W. S. Connor, Research Triangle Institute 
and N. C. Severo, University of Buffalo 
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Chairman: D. B. OWEN, Sandia Corporation 
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Leone, Case Institute of Technology 
“The Construction of Orthogonal Latin Squares of Non-Prime Power 
and Related Orthogonal Arrays, Using a Computer” by I. M. Chakravarti, 
University of North Carolina 
“Computer Techniques in the Statistical Investigation of Non-Linear 
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Parametric Relations”, T. I. Peterson, International Business Machines 
Corporation. 


4:00 PM-5:50 PM 
Statistics in Precision Measurement 


Chairman: CHURCHILL EISENHART, National Bureau of Standards 


Papers: “Design and Control of Mass Calibration Series” by H. S. Peiser, 
National Bureau of Standards 
“An Analysis of the Accumulated Error In A Hierarchy of Calibrations” 
by E. L. Crow, National Bureau of Standards 
“The Assignment of Uncertainties to Experimental Values” by A. G. 
McNish, National Bureau of Standards 
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Reliability 
Chairman: GEORGE L. LEVENBACH, Bell Laboratories 


Papers: “Analysis of Failure Data on Components” by W.S. Connor and W. T. 
Wells, Research Triangle Institute. 
“Investigation of the Robustness of Statistical Life Testing Procedures” 
by Marvin Zelen and Marcy C. Dannemiller, National Bureau of 
Standards. 
“Decision Rules in Reliability Testing” by David S. Stoller, Rand 
Corporation 


Discussion: Philip Brown, U. S. Navy Department 
Frank Proschan, Sylvania Electric Products, Inc. 


8:30 AM-10:20 AM 
Components of Variance 


Chairman: R. L. ANDERSON, North Carolina State College 


Papers: “Optimal Designs to Estimate the Parameters of Variance Component 
Models: One-Way Classification” by S. Lee Crump, Brooks Air Force 
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“On = Forms of Expectations of Mean Squares” by George Zyskind, 
Iowa State University. 
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Statistical Association 







NOTICES 


THE SEVENTH INTERNATIONAL MEETING 
OF THE INSTITUTE OF MANAGEMENT SCIENCES 


will be held on October 20-22, 1960 at the Hotel Roosevelt, New York City. 
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—Applications and Tools of Management Science 
—Use of Computers in Simulation 


The agenda also includes contributed and invited papers. 
Registration fees: Members $15, Non-members $20, Students $5 
Non-members who apply for TIMS membership during this meeting may 
apply a portion of this fee toward membership dues. 
For further information write: Mr. James Townsend, 30 E. 42nd St., New York 
17, New York 
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PREPARATION OF MANUSCRIPTS 


Manuscripts should be submitted to the office of the editor: J. S. Hunter, 
Mathematics Research Center, U. S. Army; The University of Wisconsin; 
Madison, Wisconsin. Each manuscript should be typewritten, double spaced, 
with wide margins at sides, top, and bottom. The original should be submitted 
with two additional copies, on paper that will take corrections. Dittoed or 
mimeographed papers are acceptable only if completely legible. Footnotes 
should be avoided and replaced by remarks in the text, or placed in an appendix. 
Preferably, references in the manuscript should appear as (Jones, A. B., 1958), 
and again later in alphabetical order in a list of references. Alternatively, references 
may be numbered, e.g. [1], as they appear in the manuscript and be listed 
in this sequence in the list of references. In the reference list, each reference 
should contain, in the order indicated, the name and initials of the author 
followed by those of the co-authors, date of publication, title of reference, 
source, volume number and page. References to books should include pub- 
lisher’s name and location. 

Figures, charts, and diagrams should be professionally drawn on plain white 
paper or tracing cloth in black India ink twice the size they are to be printed. 
A full page diagram, in print, measures 7.25 × 4.75 inches. 

As far as possible, formulas should be typewritten and symbols not available 
on a typewriter carefully inserted in ink. Authors are asked to keep in mind the 
typographical difficulties of complicated mathematical formulae. The difference 
between capital and lower-case letters should be clearly shown; care should be 
taken to avoid confusion between such pairs as zero and the letter O, the numeral 
1 and the letter l, numeral 1 used as superscript and prime (’), alpha and a, kappa 
and k, mu and u, nu and v, eta and n, etc. Subscripts or superscripts should be 
clearly below or above the line. Bars above groups of letters (e.g., $\overline{\log x}$) and 
underlined letters (e.g., $\underline{x}$) are difficult to print and should be avoided. Symbols 
are automatically italicized by the printer and should not be underlined on 
manuscripts. Boldface letters may be indicated by underlining with a wavy line 
on the manuscript; boldface subscripts and superscripts are not available. 
Complicated exponentials should be represented with the symbol exp, particularly 
when appearing in the text; that is, 

$\exp[(a^2 + b^2)^{1/2}]$ should be used in place of $e^{(a^2 + b^2)^{1/2}}$. 

In writing square roots the fractional exponent is preferable to the radical sign. 
Fractions in the body of the text (and when possible in displayed expressions) 
and fractions occurring in the numerators or denominators of fractions are 
preferably written with the solidus; thus 

$(a + b)/(c + d)$ rather than $\frac{a + b}{c + d}$. 


Authors will ordinarily receive only galley proofs. Fifty offprints without 
covers will be furnished free. Costs for additional reprints and covers can be 
furnished on request. 
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